I. INTRODUCTION

Databases are essential to an organization's information 
system. The information system supports the organization's 
functions, maintaining the data for these functions and 
assisting users to interpret the data for decision making. 
The database has a central role in this process. Database 
structures must be flexible to meet changing organizational 
needs. As new functions arise in an organization, new 
decisions follow in their wake. The database system should
include facilities that allow such changes to be made easily.
Characteristics of the database system are discussed in
Chapter 2.

It is not easy, however, to develop database systems
that perform in an optimal fashion. Different users will
have different requests about how data should be structured
in the database, and it is hard to satisfy all of the users
with one type of structure. There are different ways in which
data can be structured. For that reason, in the database
development phase all requests that come from
users/organizations should be evaluated carefully by the
designer(s).

For the logical design of the inventory database, the
Semantic Data Model (SDM) will be used. After that, the normal
form concept of the Relational Database will be used to
develop an inventory database.

Chapter 3 describes the basic concepts of database
design, which include logical and physical database
design and database models. Chapter 4 addresses the design
and specification of SDM. Chapter 5 describes how
the inventory database is designed by using SDM. Chapter
6 addresses the basic structure of the relational model,
which includes the functional dependency and normal form
concepts of the relational model. Chapter 7 describes the
relational design criteria and relational design procedure.
Also in this chapter, the SDM for the inventory database is
transformed into a relational model. In Chapter 8, System R
is described as a relational approach to database systems,
including its architecture and system structure, the
relational data system, and the relational storage
system. Chapter 9 describes the implementation
of the inventory database by using ORACLE. Finally, Chapter
10 presents the conclusions and recommendations of this
thesis.


II. BASIC CONCEPT OF DATABASE


A. WHAT IS A DATABASE?


Database technology has been described as "one of the
most rapidly growing areas of computer and information
science." As a field, it is still comparatively young.
Despite its youth, however, the field has quickly become one
of considerable importance, both practical and theoretical.
Today, many organizations have become critically dependent
on the continued and successful operation of a database
system.

Basically, a database is nothing more than a
computer-based record keeping system: that is, a system
whose overall purpose is to record and maintain information
that may be necessary to the decision-making processes
involved in the management of an organization. In a database,
the data definitions and the relations between the data are
separated from the procedural statements of a program. The
question to be asked here is, "What is the major distinction
between a database and a data file?" A database may have
more than one use, and the multiple uses may satisfy
multiple "views" of the data stored. A data file may have
more than one use, but only one "view" of the stored data
can be satisfied. Multiple views of a data file can be
satisfied only after the data have been sorted. In a
database environment, multiple uses may be the result of
multiple users; for example, in a banking environment the
information about customers may have several users, such as
checking, savings, and installment loan. Thus data sharing
is a major objective of an enterprise database system. A
database system involves four major components:
data, hardware, software, and users.
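
The idea of several "views" over one shared body of data can be
sketched in a few lines. The example below is only an illustration
under stated assumptions: it uses Python's built-in sqlite3 module,
and the table, view, and column names (customer_account,
checking_view, loan_view) are hypothetical, not taken from any
system discussed in this thesis.

```python
import sqlite3

# Hypothetical banking data; all names here are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer_account (
        customer_name TEXT,
        account_type  TEXT,   -- 'checking', 'savings', or 'loan'
        balance       REAL
    );
    INSERT INTO customer_account VALUES
        ('Adams', 'checking', 250.0),
        ('Adams', 'loan',     9000.0),
        ('Baker', 'savings',  4000.0);

    -- Each department gets its own view of the same stored data.
    CREATE VIEW checking_view AS
        SELECT customer_name, balance FROM customer_account
        WHERE account_type = 'checking';
    CREATE VIEW loan_view AS
        SELECT customer_name, balance AS amount_owed FROM customer_account
        WHERE account_type = 'loan';
""")

checking_rows = conn.execute("SELECT * FROM checking_view").fetchall()
loan_rows = conn.execute("SELECT * FROM loan_view").fetchall()
print(checking_rows)
print(loan_rows)
```

The data are stored once, yet the checking and loan departments
each see only the part that concerns them, without any sorting or
copying of the underlying records.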




1. Data


A database is a repository for shared data. In
general, it is both integrated and shared. "Integrated"
means that the database may be thought of as a unification
of several otherwise distinct data files, with any
redundancy among those files partially or wholly eliminated.
"Shared" means that individual pieces of data in the
database may be shared among several users, in the sense
that each of those users may have access to the same piece
of data. Such sharing is really a consequence of the fact
that the database is integrated. The term "shared" is
frequently extended to cover not only sharing as described
above, but also concurrent sharing: that is, the ability of
several different users to be actually accessing the
database at the same time.


2. Hardware


The hardware consists of the secondary storage
volumes (disks, drums, etc.) on which the database resides,
together with the associated devices, control units,
channels, and so forth.


3. Software


Between the physical database itself (i.e., the data
as actually stored) and the users of the system is a layer
of software, usually called the database management system or
DBMS. A database management system makes it possible to
access integrated data that crosses operational, functional,
or organizational boundaries within an enterprise. As an
example of a relational DBMS, System R will be evaluated in
Chapter 8.


4. Users


Three classes of users can be considered. First,
there is the application programmer, responsible for writing
application programs that use the database, typically in a
language such as COBOL or PL/I. The second class of user is
the end-user, accessing the database from a terminal. An
end-user can use a query language which is an integral part
of the system. The third class of user is the database
administrator, or DBA, who is the person (or group of
persons) responsible for overall control of the database
system.


B. OPERATIONAL DATA


Any enterprise such as a bank, hospital, university, or 
company must necessarily maintain large amounts of data 
about its operation, termed "operational data". The
operational data for such enterprises would probably include
account data, patient data, student data, product data, and 
planning data. Operational data does not include input or 
output data, work queues, or indeed any purely transient 
information. Input data refers to the information entering 
the system from the outside world; such information may 
cause a change to be made to the operational data but is not 
itself part of the database. Output data refers to messages 
and reports emanating from the system; such a report
contains information derived from the operational data, but
is not itself part of the database.


C. WHY DATABASE?


The broad answer to this question is that a database
system provides the enterprise with centralized control of
its operational data. This is in sharp contrast to the
situation that prevails in many enterprises today, where
typically each application has its own files so that the
operational data is widely dispersed, and therefore probably
difficult to control. In the database system, the DBA has
this central responsibility for the operational data. Some
of the advantages that accrue from having centralized
control of the data are described below.


1. Advantages of Database Systems 


An important advantage of database processing is the
elimination or reduction of data duplication. In a nondatabase
system each application has its own private files. This can
often lead to considerable redundancy in stored data, with
resultant waste in storage space. In the database, each data
item need only be recorded once. Elimination of duplication
saves file space and to some extent can reduce processing
requirements. In some cases there may be business reasons for
maintaining multiple copies of the same data. In the
database, however, redundancy should be controlled. The most
serious problem of data duplication is that it can lead to a
lack of data integrity. A common result of a lack of
integrity is conflicting reports.

Data integration offers several important 
advantages. First and foremost, database processing enables 
more information to be produced from a given amount of data. 
Data are recorded facts or figures; information is knowledge
gained by processing data.

Creation of program/data independence is another
advantage of a database system. In a database
application, application programs obtain data from an
intermediary, the DBMS. The application programs need not
contain the data structure; only the DBMS needs this
structure. Another advantage of database processing is
better data management. When data is centralized in a
database, one department can specialize in the maintenance
of the data. That department can specify data standards and
ensure that all data adhere to the standards.

Database processing creates another type of economy
of scale. Since there is only one DBMS processing a shared
database, improvements made to the database or to the DBMS
will benefit many users.


2. Disadvantages of Database Systems 


A major disadvantage of database processing is that 
it can be expensive. The DBMS may cost as much as $100,000 
to buy. The database management system may occupy so much
main memory that additional memory must be purchased. Even
with more memory, it may monopolize the CPU, thus forcing
the user to upgrade to a more powerful computer. Conversion
from existing systems can be costly, especially if new data
must be acquired.


Another major disadvantage is that database
processing tends to be complex. Large amounts of data in
many different formats can be interrelated in the database.
Both the database system and application programs must be
able to process these structures, requiring more
sophisticated programming. Backup and recovery are difficult
in the database environment because of increased complexity
and because databases are often processed by several users
concurrently. Determining the exact state of the database
at the time of a failure may be a problem. A final
disadvantage is that integration, and hence centralization,
increases vulnerability. A failure in one component of an
integrated system can affect the entire system.


D. DATA INDEPENDENCE


In the conventional data set environment, the
application programmer has to know the answers to the
following questions before manipulating the data:

1. What is its format?

2. Where is it located?

3. How is it accessed?

Changes in any of these three items may affect the
application program and result in other changes, since the
details of these three points may reside in the application
code. The users of the database system should be oriented
toward the information content of the data and should not be
concerned with details of the representation and location.
The ability to use the database without knowing the
representation details is called DATA INDEPENDENCE. With data
independence, the individual application
programmer no longer must change the application programs to
accommodate changes in access method, location, or format
of the data. The reasons for data independence are as
follows:

1. To allow the DBA to make changes in the content,
location, representation, and organization of a
database without causing reprogramming of application
programs which use the database.

2. To allow the supplier of data processing equipment
and software to introduce new technologies without
causing reprogramming of the customer's application.

3. To facilitate data sharing by allowing the same data
to appear to be organized differently for different
application programs.

4. To simplify application program development and, in
particular, to facilitate the development of programs
for interactive database processing.

5. To provide the centralization of control needed by
the DBA to ensure the security and integrity of the
database.
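
The first reason in the list above can be illustrated with a small
sketch. It assumes Python's built-in sqlite3 module and a
hypothetical inventory layout (part, part_stock, and part_view are
invented names). The application function refers only to a view;
the stored representation behind the view is then reorganized
without touching the application code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def quantity_on_hand(part_no):
    """Application code: it knows only the view name and its columns."""
    row = conn.execute(
        "SELECT quantity FROM part_view WHERE part_no = ?", (part_no,)
    ).fetchone()
    return row[0]

# Original stored representation: one row per part.
conn.executescript("""
    CREATE TABLE part (part_no TEXT PRIMARY KEY, quantity INTEGER);
    INSERT INTO part VALUES ('P1', 40);
    CREATE VIEW part_view AS SELECT part_no, quantity FROM part;
""")
before = quantity_on_hand("P1")

# The DBA reorganizes storage: quantities are now kept per warehouse.
conn.executescript("""
    DROP VIEW part_view;
    DROP TABLE part;
    CREATE TABLE part_stock (part_no TEXT, warehouse TEXT, quantity INTEGER);
    INSERT INTO part_stock VALUES ('P1', 'EAST', 25), ('P1', 'WEST', 15);
    CREATE VIEW part_view AS
        SELECT part_no, SUM(quantity) AS quantity
        FROM part_stock GROUP BY part_no;
""")
after = quantity_on_hand("P1")
print(before, after)
```

The function quantity_on_hand is identical before and after the
reorganization; only the view definition, which is under the DBA's
control, had to change.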


E. DATA DICTIONARY


A data dictionary is a central repository of information
about the entities, the data elements representing the
entities, the relationships between the entities, their
origins, meanings, uses, and representation formats. A
facility that provides uniform and central information about
all the data resources is called a DATA DICTIONARY (DD). The
benefits of using a data dictionary are related to the
effective collection, specification, and management of the
total data resources of an enterprise. A data dictionary
should help a database user in:

1. Communicating with other users.
2. Controlling the data elements in a simple and
effective manner; that is, introducing new elements
into the system, or changing the descriptions of the
elements.
3. Reducing the data redundancy and inconsistency.
4. Determining the impact of the changes to the data
elements on the total database.
5. Centralizing the control of the data elements as an
aid in database design and in expanding the design.
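
A very small data dictionary can be sketched as follows. This is a
minimal illustration, not a description of any particular product:
the class names, the recorded attributes (meaning, format, origin,
users), and the sample element PART-NO are all assumptions made for
the example. It shows two of the uses listed above: controlled
introduction of new elements and determining the impact of a change.

```python
from dataclasses import dataclass, field

@dataclass
class DataElement:
    """One data dictionary entry (attribute names are illustrative)."""
    name: str
    meaning: str
    fmt: str
    origin: str
    used_by: set = field(default_factory=set)

class DataDictionary:
    def __init__(self):
        self._elements = {}

    def define(self, element):
        # Central control: a name may be introduced only once.
        if element.name in self._elements:
            raise ValueError(f"{element.name} is already defined")
        self._elements[element.name] = element

    def register_use(self, name, program):
        # Record that a program depends on this data element.
        self._elements[name].used_by.add(program)

    def impact_of_change(self, name):
        # Programs to review if this element's description changes.
        return sorted(self._elements[name].used_by)

dd = DataDictionary()
dd.define(DataElement("PART-NO", "part identifier", "CHAR(8)", "purchasing"))
dd.register_use("PART-NO", "ORDER-ENTRY")
dd.register_use("PART-NO", "STOCK-REPORT")
affected = dd.impact_of_change("PART-NO")
print(affected)
```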
III. DATABASE DESIGN

A database is the interface between people and machines. 
The nature of these components is very different. The 
difficulty is to develop a database design which meets the 
needs of the people who will use it and which is practical
in terms of technology and hardware. Since the database is
the bridge between humans on one side and hardware on the 
other, it must match the characteristics of each. 

There is no algorithm for database design. Database
design is both art and science. Dealing with people,
understanding what they want today, predicting what they
will want tomorrow, differentiating between individual needs
and community needs, and making appropriate design tradeoffs
are artistic tasks. There are principles and tools, but
these must be used in conjunction with intuition and guided
by experience.

Database design is a two-phased process. The first phase
of database design is usually called the Logical
Database Design phase, in which the designer examines the
users' requirements and builds a conceptual database structure
that is a model of the organization. Once the logical design of
the database is completed, this design is formulated in
terms of a particular DBMS. Usually compromises must be
made. The process of formulating the logical design in terms
of DBMS facilities is called Physical Database Design. This
chapter considers both phases of database design.


A. LOGICAL DATABASE DESIGN


Typically, database design is an iterative process;
during each iteration, the goal is to get closer to an
acceptable design. Thus a design will be developed and then
reviewed. Defects in the design will be identified, and the
design will be revised. This process is repeated until the
development team and users can find no major defects. This
does not mean the design will work; it simply means no one
can think of any reason why it will not work. Figure 3.1
illustrates the steps in a typical database design
project [Ref. 4].

User requirements are studied and a logical database design
is developed. Concurrently, the preliminary design of the
database processing programs is produced. Next, the logical
database and the preliminary program designs are used to
develop the physical database design and the detailed program
design specifications. Finally, both of these are input to
the implementation phase of the project.


1. Inputs to Logical Database Design


The inputs to the logical database design are the
system requirements and the project plan. Requirements are
determined by interviews with users, and then they are
approved by both users and management. The project plan
describes the system environment, the development plan, and
constraints and limitations on the system design. Policy
statements can be used to develop the descriptions of the
logical database design.


2. Outputs of the Logical Database Design


A logical database design specifies the logical
format of the database. The records to be maintained, their
contents, and relationships among these records are
specified.

To specify logical records, the designer must
specify the level of detail of the database model. If
the model is highly aggregated and generalized, there will
be few records. If the model is detailed, there will be many
records. The designer must examine the requirements to
determine how coarse or how fine the database model should
be. The contents of these records are specified during logical
design. Names of fields and their formats must be
determined. As the requirements are evaluated and the design
progresses, constraints on data items will be identified.
These are limitations on the values that data can have.
The following types of constraints are common. Field
constraints limit the values that a given data item can have.
Intrarecord constraints limit values between fields within a
given record. Also, record relationships are specified
during logical design. The designer studies the application
environment, examines the requirements, and identifies
necessary relationships. Finally, the output of the logical
database design is the specification of the database
records, their contents, constraints, and relationships.
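
The two kinds of constraints named above can be made concrete with
a short sketch. It assumes Python's built-in sqlite3 module and a
hypothetical stock_item record; the field names and the particular
rules (non-negative quantities, reorder level not above the maximum
level) are invented for illustration and are not requirements of the
inventory database designed later.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE stock_item (
        item_no        TEXT PRIMARY KEY,
        quantity       INTEGER CHECK (quantity >= 0),        -- field constraint
        reorder_level  INTEGER CHECK (reorder_level >= 0),   -- field constraint
        max_level      INTEGER,
        CHECK (reorder_level <= max_level)                   -- intrarecord constraint
    )
""")

def try_insert(row):
    """Return True if the row satisfies every constraint, else False."""
    try:
        conn.execute("INSERT INTO stock_item VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

ok = try_insert(("A1", 10, 5, 50))               # satisfies all constraints
bad_field = try_insert(("A2", -1, 5, 50))        # negative quantity
bad_intrarecord = try_insert(("A3", 10, 60, 50)) # reorder level above maximum
print(ok, bad_field, bad_intrarecord)
```

A field constraint looks at one data item in isolation, while the
intrarecord constraint compares two fields of the same record.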


3. Stages of Logical Database Design





Many techniques have been developed for logical
database design. Some techniques are completely intuitive
and others involve specific procedures for processing a data
dictionary. Others are between these extremes. The major
steps in logical database design are as follows.


a. Identify Data to be Stored


First, the data dictionary is processed and data 
that is to be stored is identified and segregated. This step 
is necessary because the data dictionary will contain the 
description of the reports, screens, and input documents 
that will not be part of the database. 


b. Consolidate and Clarify Data Names


The next step is to clarify the terms used for
the data. One task is to identify synonyms, to decide on
standard names for synonyms, and to record aliases.
Synonyms are two or more names for the same data item. They
arise because of terminology differences within the
organization. In this case the designer will need to select
a single, standard name for the data item in the logical
schema of the database. In some cases synonyms cannot be
eliminated because the users want to maintain their own
terminology.

Another task related to terminology is to ensure
that data items having the same name are truly the same. If
not, unique data item names must be developed. Consider the
data item DATE. This can be the date of shipment, the date
of employee termination, or the date of order. The designer
must determine if all of the uses of the DATE item are the
same. If not, new and unique names must be determined.
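
The two naming checks described in this step can be sketched as a
simple cross-tabulation of names against meanings. The usage list
below is hypothetical; the DATE entries follow the example in the
text, while PART-NO and ITEM-NUMBER are invented to show a synonym
pair.

```python
from collections import defaultdict

# (name used, department using it, what the item actually means)
usages = [
    ("DATE", "shipping", "date of shipment"),
    ("DATE", "personnel", "date of employee termination"),
    ("DATE", "orders", "date of order"),
    ("PART-NO", "warehouse", "part identifier"),
    ("ITEM-NUMBER", "purchasing", "part identifier"),
]

meanings_by_name = defaultdict(set)
names_by_meaning = defaultdict(set)
for name, _department, meaning in usages:
    meanings_by_name[name].add(meaning)
    names_by_meaning[meaning].add(name)

# One name with several meanings: new, unique names are needed.
same_name_different_items = {
    name: sorted(m) for name, m in meanings_by_name.items() if len(m) > 1
}
# Several names with one meaning: choose a standard name, record aliases.
synonyms = {
    meaning: sorted(n) for meaning, n in names_by_meaning.items() if len(n) > 1
}
print(same_name_different_items)
print(synonyms)
```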
c. Develop the Logical Schema


The essential step in the design process is to
develop the logical schema by defining records and
relationships. Records are defined by determining the data
items they will contain. The designers examine the data flow
diagrams and data dictionary, apply intuition to the
business setting of the new system, and determine that
certain records will need to exist. After this
determination, some of the files may need to be combined and
some of them may not.

The second step in developing the logical schema
is to determine the relationships among database records. At
this point, how the relationships will be represented by the
database system is not important. Instead, the design team
wants to model how the users see the relationships. We do
not need to consider physical limitations at this point.
Doing so makes the logical schema too complex and may
constrain our thinking so that we miss good design
alternatives. At this point, the design team must
discriminate between theoretical and useful relationships. A
theoretical relationship can exist logically, but never be
needed in practice. In general, if there is any question
regarding whether a relationship is useful or not, then the
relationship should be included in the logical schema. The
relationship can always be omitted later in the physical
design, whereas if the relationship were omitted during
logical design, it would be difficult to add later.


d. Define Processing


The next step is to define the processing of the
database. The requirements are examined to determine how the
database should be manipulated to produce required results.
The processing definitions can be developed in several ways.
One method is to describe transactions and data to be
modified. Another method is to develop structure charts of
the programs that will access the database. This process is
important because concurrent design of the programs and
database will improve the database design. It is also clear
that concurrent design improves the quality of programs.


e. Design Review


The final stage of the logical database design
is a review. The logical schema and users' views are
examined in the light of the requirements and program
descriptions. Every attempt is made to identify omissions
and unworkable aspects of the design. Typically, a panel of
independent data processing people is convened for this
review. Documentation of the logical schema, users' views,
and program descriptions is examined by the panel, and oral
presentations are evaluated.

At the conclusion of the design review, the
panel produces a list of problems discovered and a
recommendation regarding the next step to be taken.


B. PHYSICAL DATABASE DESIGN


The second stage of database design is physical
design, which is a stage of transformation. The logical
schema is transformed into the particular data constructs
that are available with the DBMS to be used. As mentioned
before, the inputs to the physical database design are the
outputs of the logical database design, the system
requirements, and the preliminary design of programs.
Whereas the logical design is DBMS independent, the physical
design is very much DBMS dependent. A detailed specification
of the database structure is produced. These specifications
will be used during database implementation to write source
statements that define the database structure to the DBMS.
These statements will be compiled by the DBMS and the object
form of the database structure will be stored within the
database as shown in Figure 3.2 [Ref. 4].

Figure 3.2 Physical Design Process.


1. Physical Design Steps 


Practical experience has shown that neither the
starting point nor the order of steps can be definitely
stated for a given design problem. On the other hand, the
physical design phase can be regarded as an iterative
process of initial design and refinement. Each step needs
to be performed several times, but succeeding analyses
should be done more quickly because the procedure is known
and the number of unchanging performance variables should
increase between iterations. The steps of physical design are
as follows.


a. Stored Record Format Design


Assuming that the logical record structure has
been defined, this process addresses the problem of
formatting stored data by analysis of the characteristics of
data item types, distribution of their values, and their
usage by various applications. Certain data items are often
accessed more frequently than others, but each time a
particular piece of data is needed, the entire stored
record, and all stored records in a physical block as well,
must be accessed. Record partitioning defines an allocation
of individual data items to separate physical devices of the
same or different types, or separate extents on the same
device, so that the total cost of accessing data for a given
set of user applications is minimized. Logically, data items
related to a single entity are still considered to be
connected, and physically they can still be retrieved
together when necessary.


b. Stored Record Clustering





Record clustering refers to the allocation of
records of different types into physical clusters to take
advantage of physical sequentiality whenever possible.
Associated with both record clustering and record
partitioning is the selection of physical block size. Blocks
in a given clustered extent are influenced somewhat by
stored record size, but also by storage characteristics of
physical devices. Choice of block size may be subject to
considerable revision during an iterative design process.


c. Access Method Design


An access method provides storage and retrieval
capabilities for data stored on physical devices, usually
secondary storage. The two critical components of an access
method are storage structure and search mechanism. Storage
structure defines the limits of possible access paths
through indexes and stored records, and search mechanisms
define which paths are to be taken for given applications.
Access method design is often defined in terms of primary
and secondary access path structure. The primary access
paths are associated with initial record loading, or
placement, and usually involve retrieval via the primary
key. Secondary access paths include interfile linkages and
alternate entry-point access to stored records via indexes
and secondary keys. The trade-off is that access time can be
greatly reduced through secondary indexes, but at the
expense of increased storage space overhead and index
maintenance.
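
The effect of a secondary access path can be observed in a small
experiment. The sketch below assumes Python's built-in sqlite3
module, which is used here only as a convenient stand-in for a DBMS;
the part table and the index name idx_part_supplier are hypothetical.
SQLite's EXPLAIN QUERY PLAN statement reports the access path the
system chooses, and its exact wording varies between SQLite versions,
so the plans are printed rather than quoted here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE part (part_no INTEGER PRIMARY KEY, supplier TEXT, descr TEXT)"
)
conn.executemany(
    "INSERT INTO part VALUES (?, ?, ?)",
    [(i, f"S{i % 50}", f"part {i}") for i in range(1000)],
)

def access_path(sql):
    """Return the access path SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(row[-1] for row in rows)  # last column holds the detail text

query = "SELECT * FROM part WHERE supplier = 'S7'"

# Without a secondary index, every stored record must be examined.
plan_without_index = access_path(query)

# Adding a secondary index on the secondary key creates a new access
# path, at the price of extra storage and index maintenance on updates.
conn.execute("CREATE INDEX idx_part_supplier ON part (supplier)")
plan_with_index = access_path(query)

print(plan_without_index)
print(plan_with_index)
```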

As a fourth step of physical design, trade-offs among
integrity, security, and efficiency requirements also should
be considered.


d. Program Design


The goal of physical data independence, if
met, precludes application program modification due to
physical structure design decisions. Standard DBMS routines
should be used for all accessing, and query or update
transaction optimization should be performed at the system
software level. Then, application program design should be
completed when the logical database structure is known. When
physical data independence is not guaranteed, program
modification is likely.


2. Physical Design Environment


The design environment is basically the same for
both file design and physical database design. Major
categories of inputs and outputs for the physical design
phase are illustrated in Figure 3.3. The logical database
structure resulting from the implementation design phase
defines the framework from which the physical designer
works. If no catastrophic inefficiency is detected, it will
remain unchanged during physical design. In general, new
parameters will be considered, but previous tentative
decisions on access paths and record allocation are
finalized in this phase. New parameters are those specific
to DBMS and operating system access methods, those that
describe physical device capacity limitations and timing
characteristics, and all operational requirements, which are
constraints imposed on integrity, security, and response
time under static conditions and for dynamic growth
projections. During the design process, consideration of
efficiency issues can take place only after the various
constraints are satisfied and a feasible solution has been
obtained.


3. Performance Measures





The determination of performance measures for
physical design is most critical to the design process. They
affect not only the design choices, but also the techniques
employed to determine those choices.

Multiple performance measures provide the designer
with flexibility for decision making for both the initial
design procedure and for future modifications. If we
describe the database system performance in terms of cost, we
should consider life cycle cost in terms of the following items:

1. Planning cost

2. Design cost: programs, databases

3. Implementation and testing cost: programs, databases

4. Operational cost: users, computer resources

5. Maintenance cost: program errors, data integrity loss

Figure 3.3 Physical Design Environment.

The major problem that the physical database
designer must address is how to minimize present and future
operational costs in terms of user needs and computer
resources. The costs of the remaining life cycle phases are
well defined for general software systems. Operational costs
are unique to physical design and can be categorized as
follows:


1. Query response time 

2. Update transaction cost 

3. Report generation cost 

4. Reorganization frequency and cost

5. Main storage cost

6. Secondary storage cost

Each of these components is important to the
designer; typical considerations are shown in Figure 3.4.

Figure 3.4 Query Response Time Components.


C. DATABASE MODELS


A database model is a vocabulary for describing the
structure and processing of a database. There are two
reasons for studying database models. First, they are
important database design tools. Database models can be used
for both logical and physical database design, much as
flowcharts or pseudocode are used for program design.
Second, database models are used to categorize DBMS
products. Database models have two major components. First,
the data definition language (DDL) is a vocabulary for
defining the structure of a database. The DDL must include
terms for defining records, fields, keys, and relationships.
It should also provide a facility for expressing a variety
of user views. As a second component, the data manipulation
language (DML) is a vocabulary for describing the processing
of the database. Two types of DML exist: procedural and
nonprocedural. Facilities are needed to retrieve and
change data for both. Procedural DML is a language for
describing actions to be performed on the database. It
obtains a desired result by specifying the operations to be
performed. Nonprocedural DML is a language for describing the
data that is wanted without describing how to obtain it.
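
The difference between the two styles of DML can be seen by
answering the same question in both ways. The sketch below is
illustrative only: a plain Python loop stands in for a procedural
DML, and an SQL query run through Python's built-in sqlite3 module
stands in for a nonprocedural DML. The part records are invented.

```python
import sqlite3

records = [("P1", "bolt", 40), ("P2", "nut", 0), ("P3", "washer", 12)]

# Procedural style: state the operations (read each record, test it,
# collect the ones that qualify).
in_stock_procedural = []
for part_no, name, quantity in records:
    if quantity > 0:
        in_stock_procedural.append(part_no)

# Nonprocedural style: state only which data are wanted; the system
# decides how to find them.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE part (part_no TEXT, name TEXT, quantity INTEGER)")
conn.executemany("INSERT INTO part VALUES (?, ?, ?)", records)
in_stock_nonprocedural = [
    row[0]
    for row in conn.execute(
        "SELECT part_no FROM part WHERE quantity > 0 ORDER BY part_no"
    )
]

print(in_stock_procedural, in_stock_nonprocedural)
```

The CREATE TABLE statement in the sketch plays the role of the DDL,
while the INSERT and SELECT statements play the role of the DML.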

Figure 3.5 illustrates six common and useful database
models. The models are arranged on a continuum. Models on
the left-hand side of this figure tend to be oriented to
humans and human meaning, whereas those on the right-hand
side are more oriented toward machines and machine
specifications [Ref. 4: p. 215].

The primary purpose of this thesis is to design and
implement an inventory database. For the logical design of
this database, the Semantic Data Model (SDM) will be used,
and for physical design a relational model will be employed.
For this reason, SDM and the relational model will be
discussed in detail in the following chapters.


Figure 3.5 Relationships of Six Important Data Models.


A. INTRODUCTION


The Semantic Data Model (SDM) was developed by M. Hammer
and D. McLeod in 1981 [Ref. 13]. SDM is a high-level,
semantics-based database description and structuring
formalism for databases. This database model is designed to
capture more of the meaning of an application environment
than is possible with contemporary database models. An SDM
specification describes a database in terms of the kinds of
entities that exist in the application environment, the
classifications and groupings of those entities, and the
structural interconnections among them. SDM provides a
collection of high-level modeling primitives to capture the
semantics of an application environment. SDM is designed to
enhance the effectiveness and usability of database systems.
An SDM database description can serve as a formal
specification and documentation tool for a database.

Every database is a model of some real-world system. The
contents of a database are intended to represent a snapshot
of the state of an application environment, and each change
to the database should reflect an event occurring in that
environment. It is appropriate that the structure of the
database mirror the structure of the system that is being
modelled [Ref. 13]. A database whose organization is based
on naturally occurring structures will be easier for a
database designer to construct and modify than one that
forces him to translate the primitives of his problem domain
into an artificial specification construct. Similarly, a
database user should find it easier to understand and employ
a database if it can be described to him using concepts with
which he is already familiar.




Contemporary database models provide data structures
which do not adequately support the design, evolution, and
use of complex databases. These models have significantly
limited capabilities for expressing the meaning of a
database. The semantics of a database defined in terms of
these mechanisms are not readily apparent from the schema,
which is the global user view of a database; instead, the
semantics must be separately specified by the database
designer and consciously applied by the user.


B. THE DESIGN OF SDM


SDM has been defined with a number of specific kinds of
uses in mind. First, SDM is meant to serve as a formal
specification mechanism for describing the meaning of a
database; an SDM schema provides a precise documentation and
communication medium for database users. With such a schema,
a new user of a complex database can readily find out what
information is contained in the database. Second, SDM
provides the basis for a variety of high-level,
semantics-based user interfaces to a database; these
facilities can be constructed as front-ends to existing
database management systems, or as the query language of a
new DBMS. Such interfaces improve the process of identifying
and retrieving relevant information from the database.
Finally, SDM provides a foundation for supporting the
effective and structured design of databases and
database-intensive application systems.

SDM has been designed to satisfy a number of criteria
that are not met by contemporary database models, but are
essential in an effective database description and
structuring formalism. They are as follows [Ref. 13]:

"The constructs of the database model should provide for the
explicit specification of a large portion of the meaning
of a database. Many contemporary database models (such
as the CODASYL DBTG network model and the hierarchical
model) exhibit compromises between the desire to provide
a user-oriented database organization and the need to
support efficient database storage and manipulation
facilities. By contrast, the relational database model
stresses the separation of user-level database
specifications and underlying implementation detail (data
independence). However, the semantic expressiveness of
the hierarchical, network, and relational models is
limited; they do not provide sufficient mechanisms to
allow a database schema to describe the meaning of a
database. They employ overly simple data structures to
model an application environment. In so doing, they
inevitably lose information about the database. This is
a consequence of the fact that their structures are
essentially all record-oriented constructs; the
appropriateness and adequacy of the record construct for
expressing database semantics is highly limited. It is
essential that the database model provide a rich set of
features to allow the direct modeling of application
environment semantics.

A database model must support a relativist view of the
meaning of a database, and allow the structure of a
database to support alternative ways of looking at the
same information. Flexibility is essential in order to
allow for multiple and coequal views of the data. In a
logically redundant database schema the values of some
database components can be algorithmically derived from
others. Incorporating such derived information into a
schema can simplify the user's manipulation of a database
by statically embedding in the schema data values that
would otherwise have to be dynamically and repeatedly
computed. Finally, an integrated schema explicitly
describes the relationships and similarities between
multiple ways of viewing the same information.
Contemporary database models do not adequately support
relativism. In these models, it is generally necessary to
impose a single structural organization of the data, one
which inevitably carries along with it a particular
interpretation of the data's meaning."


A database model must support the definition of schemata
that are based on abstract entities. Specifically, this
means that a database model must facilitate the description
of relevant entities in the application environment,
collections of such entities, relationships among entities,
and structural interconnections among the collections.
Moreover, the entities themselves must be distinguished from
their syntactic identifiers; the user-level view of a
database should be based on actual entities rather than on
artificial entity names. Allowing entities to represent
themselves makes it possible to directly reference an entity
from a related one. In record-oriented database models, it
is necessary to cross-reference between related entities by
means of their identifiers [Ref. 11].
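The contrast between cross-referencing by identifier and referencing an entity directly can be illustrated with a small sketch. The class names, field names, and values below are hypothetical and chosen only for the example; Python dictionaries stand in for record files and dataclass instances stand in for entities.

```python
from dataclasses import dataclass

# Record-oriented view: records refer to each other by identifier,
# so following a relationship needs a separate lookup step.
unit_records = {"U1": {"unit_code": "U1", "location": "Base A"}}
item_records = {"N1": {"nsn": "N1", "used_by_unit": "U1"}}


def location_via_identifier(nsn):
    unit_id = item_records[nsn]["used_by_unit"]   # fetch the identifier
    return unit_records[unit_id]["location"]      # cross-reference it


# Entity-based view: the attribute value is the related entity itself.
@dataclass
class Unit:
    unit_code: str
    location: str


@dataclass
class Item:
    nsn: str
    used_by: Unit


base_a = Unit("U1", "Base A")
item = Item("N1", base_a)


def location_via_entity(it):
    return it.used_by.location                    # direct reference


print(location_via_identifier("N1"), location_via_entity(item))
```

In the second form the user never handles the unit's identifier; the relationship is followed through the entity itself.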


C. A SPECIFICATION OF SDM


The following general principles were specified by Hammer
and McLeod in 1981:

1. A database is to be viewed as a collection of entities
that correspond to the actual objects in the
application environment.

2. The entities in a database are organized into classes
that are meaningful collections of entities.

3. The classes of a database are not generally
independent, but rather are logically related by
means of interclass connections.

4. Database entities and classes have attributes that
describe their characteristics and relate them to
other database entities. An attribute value may be
derived from other values in the database.

5. There are several primitives for defining interclass
connections and derived attributes, corresponding to
the most common types of information redundancy
appearing in database applications. These facilities
integrate multiple ways of viewing the same basic
information, and provide building blocks for
describing complex attributes and interclass
relationships.


1. Classes





As mentioned above, an SDM database is a collection
of entities that are organized into classes. Figure 4.1
shows the basic format of an SDM entity class description.
The structure and organization of an SDM database is
specified by an SDM schema, which identifies the classes in
the database. Each class in an SDM schema has the following
features.

1. A class name identifies the class. Multiple
synonymous names are also permitted. Each class name
must be unique with respect to all class names used
in the schema.

2. The class has a collection of members: the entities
that constitute it. Each class in an SDM schema is a
homogeneous collection of one type of entity at an
appropriate level of abstraction. The entities in a
class may correspond to various kinds of objects in
the application environment.

3. An optional textual class description describes the
meaning and content of the class. A class description
should be used to describe the specific nature of the
entities that constitute a class and to indicate
their significance and role in the application
environment.

4. The class has a collection of attributes that
describe the members of that class or the class as a
whole. There are two types of attributes, classified
according to applicability. A member attribute
describes an aspect of each member of a class by
logically connecting the member to one or more
related entities in the same or other classes. A
class attribute describes a property of a class taken
as a whole.

5. The class is either a base class or a nonbase class.
A base class is one that is defined independently of
all other classes in the database; it can be thought
of as modeling a primitive entity in the application
environment. Base classes are mutually disjoint in
that every entity is a member of exactly one base
class. A nonbase class is one that does not have
independent existence; rather, it is defined in terms
of one or more other classes. In SDM, classes are
structurally related by means of interclass
connections. Each nonbase class has associated with
it an interclass connection. If the class is a base
class, it has an associated list of groups of member
attributes; each of these groups serves as a logical
key to uniquely identify the members of the class. A
base class is also specified as either containing
duplicates or not containing duplicates.


a. Interclass Connections


There are two main types of interclass
connections in SDM: the first allows subclasses to be
defined and the second supports grouping classes. The
subclass connection specifies that the members of a nonbase
class (S) are of the same basic entity type as those in the
class to which it is related via the interclass connection.
This type of interclass connection is used to define a
subclass of a given class. A subclass S of class C is a
class that contains some, but not necessarily all, of the
members of C. In SDM, a subclass S is defined by specifying
a class C and a predicate P on the members of C; S consists
of just those members of C that satisfy P. Several types of
predicates are permissible. A predicate on the member
attributes of C can be used to indicate which members of C
are also members of S. The predicate "where specified" can
also be used to define S. This means that S contains at all
times only entities that are members of C, but which members
of C belong to S is decided by the database users rather
than by an attribute predicate. It is also possible to
define subclass S as an intersection of database classes
(C1, C2).
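A predicate-defined subclass and an intersection subclass can be sketched as simple filters over the members of a class. This is an illustration only: the entity fields, values, and helper names are hypothetical, and Python lists stand in for SDM classes.

```python
# Hypothetical item entities; field names and values are illustrative.
items = [
    {"nsn": "N1", "qty_on_hand": 0, "kind": "weapon"},
    {"nsn": "N2", "qty_on_hand": 7, "kind": "communication"},
    {"nsn": "N3", "qty_on_hand": 3, "kind": "weapon"},
]


def subclass(members, predicate):
    """S consists of just those members of C that satisfy predicate P."""
    return [m for m in members if predicate(m)]


def intersection(c1, c2):
    """Subclass defined as the members common to classes C1 and C2."""
    return [m for m in c1 if m in c2]


in_stock = subclass(items, lambda m: m["qty_on_hand"] > 0)
weapons = subclass(items, lambda m: m["kind"] == "weapon")
weapons_in_stock = intersection(in_stock, weapons)

print([m["nsn"] for m in weapons_in_stock])
```

Because each subclass is computed from its parent class, every member of S is necessarily a member of C.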

The other type of interclass connection allows
for the definition of a nonbase class, called a grouping
class (G), whose members are of a higher-order entity type
than those in the underlying class (U). A grouping class is
second order, in the sense that its members can themselves
be viewed as classes; in particular, they are classes whose
members are taken from U.

Figure 4.1 Format of SDM Entity Class Description.
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2. Attributes 


Each class has an associated collection of
attributes. Each attribute has the following features.

1. An attribute name identifies the attribute. An
attribute name must be unique with respect to the set
of all attribute names used in the class, the class's
underlying base class, and all eventual subclasses of
that base class.

2. The attribute has a value which is either an entity 
in the database or a collection of such entities. The 
value of an attribute is selected from its underlying 
value class, which contains the permissible values of 
the attribute. 

3. The attribute is either a member attribute, which
applies to each member of the class and so has a
value for each member, or a class attribute, which
applies to the class as a whole and has only one value
for the class.

4. The attribute is specified as either single-valued or
multivalued. The value of a single-valued attribute
is a member of the value class of the attribute. The
value of a multivalued attribute is a subclass of the
value class. Thus, a multivalued attribute itself
defines a class, that is, a collection of entities. A
multivalued member attribute can be specified as
nonoverlapping, which means that the values of the
attribute for two different entities have no entities
in common; that is, each member of the value class of
the attribute is used at most once.

5. An attribute can be specified as mandatory, which 
means that a null value is not allowed for it. 

6. An attribute can be specified as not changeable, which
means that once set to a nonnull value, this value
cannot be altered except to correct an error.

a. Member Attribute Interrelationships 


(1) Inverse. The first way in which a pair
of member attributes can be related is by means of
inversion. Member attribute X1 of class Y1 can be
specified as the inverse of member attribute X2 of Y2,
which means that the value of X1 for a member M1 of Y1
consists of those members of Y2 whose value of X2 is M1.
The inversion interattribute relationship is specified
symmetrically in that both an attribute and its inverse
contain a description of the inversion relationship. A
pair of inverse attributes establishes a binary association
between the members of the classes that the attributes
modify.
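The inversion rule can be checked with a small sketch. The attribute and member names below (which units may use which items) are hypothetical, and a Python dictionary of sets stands in for a multivalued member attribute. The helper computes, for each member M1, the set of members whose attribute value contains M1, which is the definition of the inverse given above.

```python
from collections import defaultdict

# Hypothetical attribute X2: for each item, the units allowed to use it.
units_allowed = {
    "N1": {"U1", "U2"},
    "N2": {"U2"},
}


def invert(attribute):
    """Return the inverse attribute: member -> members that refer to it."""
    inverse = defaultdict(set)
    for member, values in attribute.items():
        for value in values:
            inverse[value].add(member)
    return dict(inverse)


# Inverse attribute X1: for each unit, the items it may use.
items_usable = invert(units_allowed)
print(items_usable)
```

Inverting twice gives back the original attribute, which reflects the symmetric way the relationship is specified.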

(2) Matching. The second way in which a
member attribute can be related to other information in
the database is by matching the value of the attribute
with some member(s) of a specified class. The value of
match attribute A1 for the member M1 of class C1 is
determined as follows.

1. A member M2 of some class C2 is found that has M1 as
its value of member attribute A2.
2. The value of member attribute A3 for M2 is used as
the value of A1 for M1.

If A1 is a multivalued attribute, then it
is permissible for each member of C1 to match several
members of C2; in this case, the collection of A3 values is
the value of attribute A1.

Matching permits the specification of
binary and higher degree associations, while inversion
permits only binary associations. The combined use of
inversion and matching allows an SDM schema to accommodate
relative viewpoints of an association.
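The two matching steps translate directly into a lookup. The following is a minimal sketch under stated assumptions: the class C2 is represented by a list of hypothetical order-like records, A2 and A3 are passed as field names, and raising an error on multiple matches for a single-valued attribute is a choice made for this example, not something the SDM definition prescribes.

```python
# C2: hypothetical order-like entities. Here A2 = "item", A3 = "amount".
orders = [
    {"item": "N1", "amount": 40},
    {"item": "N2", "amount": 15},
    {"item": "N1", "amount": 10},
]


def match_single(m1, c2, a2, a3):
    """Single-valued A1: the A3 value of the member of C2 whose A2 is M1."""
    hits = [m2[a3] for m2 in c2 if m2[a2] == m1]
    if len(hits) > 1:
        raise ValueError("several matches for a single-valued attribute")
    return hits[0] if hits else None


def match_multi(m1, c2, a2, a3):
    """Multivalued A1: the collection of A3 values of all matching members."""
    return [m2[a3] for m2 in c2 if m2[a2] == m1]


print(match_single("N2", orders, "item", "amount"))
print(match_multi("N1", orders, "item", "amount"))
```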

(3) Derivation. SDM provides the ability to
define an attribute whose value is calculated from other
information in the database. Such an attribute is called
derived, and the specification of its computation is its
associated derivation. The following rules were formulated
by Hammer and McLeod in order to allow the use of
derivations while avoiding the danger of inconsistent
attribute specifications.

1. Every attribute may or may not have an inverse; if it
does, the inverse must be defined consistently with
the attribute.

2. Every member attribute A1 satisfies one of the
following cases:

   1. A1 has exactly one derivation. In this case, the
   value of A1 is completely specified by the derivation.
   The inverse of A1, if it exists, may not have a
   derivation or matching specification.

   2. A1 has exactly one matching specification. In this
   case, the value of A1 is completely specified by its
   relationship with an entity to which it is matched.
   The inverse of A1, if it exists, may not have a
   derivation.

   3. A1 has neither a matching specification nor a
   derivation. In this case, it may be that the inverse
   of A1 has a matching specification or a derivation;
   if so, then one of the above two cases applies.


D. ADVANTAGES OF SDM


1. SDM provides an effective base for accommodating the
evolution of the content, structure, and use of a
database. Relativism, logical redundancy, and derived
information support this natural evolution of
schemata.

2. SDM supports a basic methodology that can guide the
Database Administrator (DBA) in the design process by
providing him with a set of natural design templates.
The DBA can approach the application in question with
the intent of identifying its classes, subclasses,
and so on. Then he can select representations for
these constructs.

3. It provides a facility for expressing meaning about
the data in the database. During logical database
design, the designer needs such a facility to avoid
confusion and to document learning, design decisions,
and constraints. SDM provides better facilities for
such documentation than other data models.

4. It allows data to be described in context. Users see
data from different perspectives.

5. In SDM, constraints on operational data can be
defined. For example, if a given item is not
changeable, SDM allows this fact to be stated.

6. An SDM schema for a database can serve as a readable
description of its contents, organized in terms that
a user is likely to be able to comprehend and
identify.





Figures 5.2, 5.3, 5.4, 5.5, 5.6, and 5.7 describe the
logical schema of the inventory database. There are five
records in the logical schema. The IDENTIFICATION record
gives all the information about a given item in the Air
Force inventory, such as the national stock number, the
document which provides technical information about the
item, total quantity in the inventory, total amount used in
the past, maximum authorized quantity to keep in the
inventory, who is authorized to use the item, the depot in
which the item is stocked, total number of the item used by
units, the name of the supplier who supplies the item, and
the amount purchased in the past. The second record is UNIT,
which provides information about the units in which an item
is used. It has several fields, such as unit code, superior
command, national stock number of the item which is used in
the unit, quantity on hand, used amount, required amount,
location of the unit, and subordinate command. The third
record is ORDER. This record describes the ordering process
of the item. Supplier name, Nsn_no, date, amount, and
shipment type are the member attributes of ORDER. The fourth
record is DEPOT_STOCK_LEVEL, which provides data about the
stock status of the item. Its fields are Depo_id,
Nsn_no_registered, stock_amount, and supplier name. The
SUPPLIER record provides data about the suppliers who supply
items to the Air Force. Supplier name, country, city, and
address are the member attributes of SUPPLIER.

In the logical schema of the inventory database all
classes and their member attributes are informally defined
and special remarks are written. The purpose of this process
is to present the semantics of the database, which will let
the user easily understand the database. Figure 5.1 shows
the general structure of the records and the member
attribute interrelationships. As mentioned in the previous
chapter, SDM provides three facilities for defining
relationships. All three facilities use the SDM
characteristic that entities can be contained within
entities. Derivation, inverse, and match facilities are
discussed in Chapter 5.

In the IDENTIFICATION record there is a derivation
between Tot_used_in_past and Sum_of_used_Units. This
means that Tot_used_in_past is derived from
Sum_of_used_Units by summation as specified. There is also
a match between past_amount_purchased of the IDENTIFICATION
class and amount of the ORDER class. This means that when an
order occurs, the value of this member is moved to
past_amount_purchased of IDENTIFICATION. On the class
level, a member of IDENTIFICATION is to be matched with a
member of ORDER. This is physically meaningful as well as
logically. When the logistics department orders an item and
receives the order, this value should be moved to
past_amount_purchased in order to keep the data correct. For
this reason, the member in the IDENTIFICATION class must
match the value in amount of ORDER. Otherwise there can be
an inconsistency in the database.
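A derived attribute of this kind can be sketched as a value that is computed on demand rather than stored. The example below is only an illustration and makes an assumption the text does not state explicitly: that the summation runs over the used amounts of the units attached to the item. The class and field names are simplified, hypothetical versions of the schema names.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Unit:
    unit_code: str
    used_amount: int


@dataclass
class Identification:
    nsn_no: str
    # Units attached to this item (assumed source of the summation).
    units: List[Unit] = field(default_factory=list)

    @property
    def tot_used_in_past(self) -> int:
        """Derived attribute: recomputed from unit data, never stored."""
        return sum(u.used_amount for u in self.units)


u1, u2 = Unit("U1", 4), Unit("U2", 6)
item = Identification("N1", [u1, u2])
print(item.tot_used_in_past)

u1.used_amount = 5            # a change in a unit is reflected at once
print(item.tot_used_in_past)
```

Because the total is never stored separately, it cannot drift out of step with the unit data it is derived from.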

There are three inverse relationships in the logical
schema: first, between Auth_to_use of the IDENTIFICATION
class and Nsn_no_use of the UNIT class; second, between
depot_of_registry of the IDENTIFICATION class and
Nsn_no_registered of the DEPOT_STOCK_LEVEL class; and third,
between superior_comm and subordinate_comm of the UNIT
class. The logic is the same for all. The inverse
facility causes two entities to be contained within each
other. As [Ref. 4] notes, this is physically impossible,
so the idea may seem a bit strange. Consider the first
inverse. The attributes of IDENTIFICATION and UNIT are
inverses of each other. In IDENTIFICATION, the attribute
Auth_to_use has the value class UNIT and the inverse
attribute Nsn_no_use. In UNIT, the attribute Nsn_no_use
has the value class IDENTIFICATION and the inverse attribute
Auth_to_use. As mentioned in the previous chapter, inverses
are always specified by such pairs. In the second inverse,
depot_of_registry of IDENTIFICATION has the value class
DEPOT_STOCK_LEVEL and the inverse attribute
Nsn_no_registered; Nsn_no_registered of DEPOT_STOCK_LEVEL
has the value class IDENTIFICATION and the inverse attribute
depot_of_registry.

It is also possible in SDM to define an inverse
relationship between two attributes which are in the same
class. This case occurs in the UNIT class. Superior_comm and
subordinate_comm are the inverse of each other. Both of them
have the same value class, UNIT. Here, the inverse
interattribute relationship is specified symmetrically: the
superior command commands the subordinate command, and the
subordinate command is commanded by the superior command.
Users can describe the data in a manner which fits their
logical view.
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Figure 5.1 Interclass Relationships of SDM Design. 
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IDENTIFICATION 


description: Overall information about a given item which
             is in the Air Force inventory.

member attributes:

   Nsn_No
      description: National stock number of a given item.
      value class: NATIONAL_STOCK_NUMBER
      mandatory
      not changeable

   Document
      description: Technical Order [TO] for a given item. It
                   specifies technical information about
                   item(s).
      value class: DOCUMENTATION
      mandatory

   Tot_Qty_On_Hand
      description: It specifies the quantity which is [...]
                   a given item in the Air Force (AF)
                   logistic [...].
      value class: QUANTITY_ON_HAND
      mandatory

   Tot_Used_In_Past
      description: Total amount which is used in the past.
      value class: TOTAL_USED_IN_PAST
      derivation: Sum_of_Used_UNITS

   Max_Auth_Qty_On_Hand
      description: Maximum number of items that the AF
                   logistics department is authorized to
                   hold, not more than this capacity.
      value class: MAX_AUTH_CAPACITY
      mandatory

   Auth_to_Use
      description: It specifies the units that are
                   authorized to use a given item.
      value class: UNIT
      mandatory
      multivalued
      inverse: Nsn_No_Use

Figure 5.2 Identification Entity Class.


   Depot_of_Registry
      description: Specifies the depot in which item is
                   registered.
      value class: DEPOT_STOCK_LEVEL
      mandatory
      multivalued
      inverse: Nsn_No_Registered

   Supplier_Name
      description: Supplier name that supplies the item(s).
      value class: SUPPLIER_NAMES
      mandatory
      multivalued

   Past_Amount_Purchased
      description: It specifies an amount that is purchased
                   in the past.
      value class: PAST_AMOUNT_PURCHASED
      match: Amount of ORDER

   Sum_of_Used_UNITS
      value class: TOTAL_USED_IN_PAST
      mandatory

identifier:
   Nsn_No + Document + Depot_of_Registry

Figure 5.3 (cont'd.).

UNIT


description: All units in the Air Force that use the items
             which are in the AF inventory.

member attributes:

   Unit_Code
      value class: UNIT
      mandatory
      not changeable

   Superior_Comm
      description: The unit which has command and control of
                   this unit.
      value class: UNIT
      mandatory
      inverse: Subordinate_Comm

   Nsn_No_Use
      description: National stock numbers that are used in
                   the unit(s).
      value class: IDENTIFICATION
      inverse: Auth_to_Use

   Qty_On_Hand
      value class: QUANTITY_ON_HAND

   Used_Amount
      description: Number of items that were previously used
                   in the unit.
      value class: TOTAL_USED_IN_PAST

   Req_Amount
      description: Specified number of items that are
                   required in the unit for operational
                   readiness.
      value class: REQUIRED_AMOUNT_IN_UNIT

   Location
      description: Location of unit in geographical
                   coordinate system.
      value class: LOCATIONS

   Subordinate_Comm
      value class: UNIT
      inverse: Superior_Comm

identifier:
   Unit_Code + Nsn_No_Use

Figure 5.4 Unit Entity Class.
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ORDER 


description: Dependent upon the requests from the unit, all
             ordered items by Department of Logistics of AF.

member attributes:

   Supp_Name
      description: Supplier name(s) that supplies the item.
      value class: SUPPLIER_NAMES
      mandatory
      not changeable

   Nsn_No
      description: National stock number of item that is
                   ordered from the supplier.
      value class: NATIONAL_STOCK_NUMBER
      mandatory

   Date
      description: Date of order.
      value class: DATES
      mandatory

   Amount
      description: Ordered amount for a given item.
      value class: ORDERED_AMOUNT

   Shipment_type
      value class: SHIPMENT
      multivalued

identifier:
   Nsn_No

Figure 5.5 Order Entity Class.


DEPOT_STOCK_LEVEL 


description: Provides information about stock level of a
             given item in the depot.

member attributes:

   Depo_ID
      value class: DEPOT_ID
      mandatory
      not changeable

   Nsn_No_Registered
      description: Different groups of items are registered
                   into different depots; for example,
                   communication items and weapon items are
                   stored in different depots. This
                   attribute specifies the items registered
                   in the depot.
      value class: IDENTIFICATION
      mandatory
      inverse: Depot_of_Registry

   Stock_Amount
      description: Number of items that are currently
                   available as stock in the depot.
      value class: STOCK_STATUS

   Supplier_Name
      value class: SUPPLIER_NAME

identifier:
   Nsn_No_Registered + Depo_ID


Figure 5.6 Depot Stock Level Entity Class.

SUPPLIER

description: All suppliers that are currently supplying
             items to the Air Force.

member attributes:

   Supp_Name
      value class: SUPPLIER_NAME

   Country
      description: Name(s) of the country(ies) that
                   currently supply the item(s).
      value class: COUNTRY
      mandatory
      multivalued

   City
      description: Location of the supplier as city.
      value class: CITIES
      multivalued

   Address
      description: Address of supplier that supplies item.
      value class: ADDRESSES

identifier:
   Supp_Name

Figure 5.7 Supplier Entity Class.

NATIONAL_STOCK_NUMBER
   interclass connection: subclass of STRING where it has
   13 numbers which are divided into four (4) groups:
   3020-00-000-0072

DOCUMENTATION
   interclass connection: subclass of STRING where
   specified format

QUANTITY_ON_HAND
   interclass connection: subclass [...] is positive
   integers.

TOTAL_USED_IN_PAST
   interclass connection: subclass [...] is positive
   integers.

MAX_AUTHORIZED_CAPACITY
   interclass connection: subclass [...] positive integers.

AUTHORIZED_TO_USE
   interclass connection: subclass [...] is five (5)
   characters

DEPO_OF_REGISTRY
   interclass connection: subclass [...] is five (5)
   characters

SUPPLIER_NAME
   interclass connection: subclass [...] is two (2)
   characters

PAST_AMOUNT_PURCHASED
   interclass connection: subclass [...] is positive
   integers.

UNIT
   interclass connection: subclass [...] specified.

USED_AMOUNT_IN_UNIT
   interclass connection: subclass [...] is positive
   integers.

REQUIRED_AMOUNT_IN_UNIT
   interclass connection: subclass [...] is positive
   integers.

LOCATION_OF_UNIT
   interclass connection: subclass [...] specified.

Figure 5.8

DATES

   interclass connection: subclass of STRING where format
   is:
      month : number where =>1 and <=12
      day   : number where integer and [...]
      year  : number where integer and [...]
      where (if (month=4 or = [...] or =9 [...]) then
      [...] =30) and (if month = 2 then [...])
      ordering by year, month, day.

ORDER_AMOUNT
   interclass connection: subclass [...] is positive
   integers.

SHIPMENT
   interclass connection: subclass [...] specified.

DEPOT_ID
   interclass connection: subclass [...] specified.

STOCK_STATUS
   interclass connection: subclass [...] specified.

COUNTRY
   interclass connection: subclass [...] specified.

ADDRESSES
   interclass connection: subclass [...] specified.

CITY
   interclass connection: subclass [...]

Figure 5.9 (cont'd).

The relational model was introduced to the database
community by E. F. Codd (1970). This innovation stressed the
independence of the relational representation from physical
computer implementation such as ordering on physical
devices, indexing, and using physical access paths. The
model thus formalized the separation of the user view of
data from its eventual implementation; it was the first
model to do so. In addition, Codd proposed criteria for
logically structuring relational databases and
implementation-independent languages to operate on those
databases. There have been many further developments in its
theory and application. Relational design procedures have
also received considerable attention in the last few years.
P. A. Bernstein (1976) proposed synthesizing relations
from functional dependencies, and Fagin's work in 1977 then
drew attention to the decomposition approach to design.


A. BASIC STRUCTURE OF THE RELATIONAL MODEL 


The usefulness of the relational model in data analysis can
be measured by considering several objectives. To meet the
first objective, identifying user requirements, the model must
serve as a communication medium between the users and the
computer personnel, giving them an interface that can be
clearly and unambiguously understood. The independence of
this interface from computer implementation is of the utmost
importance. The relational model uses tables to provide this
interface. The tabular representation of relations satisfies
the first objective of data analysis. The second objective,
the conversion to physical implementation, is also satisfied
by the relational model. One obvious approach is to directly
implement the relational model on a machine. To do this, a
DBMS that supports the relational model must be available on
the computer system. A particular set of relations can then
be directly declared by using the definitional language
provided by the system. Direct conversion was not feasible
when the relational model was first proposed by Codd in
1970, but today direct conversion from a relational
specification to physical implementation is becoming
increasingly possible. The third objective deals with the
following criteria for logical data structures:
1. Each fact should be stored once in the database.
2. The database should be consistent following database
operations.
3. The database should be resilient to change.


The first criterion not only removes storage redundancy
but also improves database consistency. If the same fact is
stored twice, it is possible that during execution of a
complex operation only one of the copies will be updated.
The database then becomes inconsistent. In an inconsistent
database, it is possible to get different database outputs
for the same fact, thus creating a reliability problem. The
second criterion requires that the database be consistent at
all times. The third criterion deals with a different
aspect. It is a consequence of the environment in which the
database exists. This environment is usually in a state of
constant change; consequently, the database must be
continually redesigned to meet continually changing user
requirements.

1. Terminology


Informally, a database is made up of any number of
relations. A relation is simply a two-dimensional table that
has several properties. First, the entries in the table are
single-valued; neither repeating groups nor arrays are
allowed. Relations are flat files. Second, the entries in
any column are all of the same kind. Columns of a relation
are referred to as attributes. Finally, no two rows in the
table are identical in all attribute values, and the order of
the rows is insignificant. Figure 6.1 shows an example of a
relation.


                     IDENTIFICATION
| NIIN             | FICHE-NO | FRAME-NO | ITEM-NO | --> attributes
| 2335-00-679-0033 | 001      | L10      | 05      | --> tuple
| 2835-00-682-5360 | 001      | A10      | 05      | --> tuple
| 2345-00-680-9876 | 002      | B77      | 08      | --> tuple


Figure 6.1 A Sample Relation Form.





Each row of the relation is known as a tuple. If the
relation has n columns, then each row is referred to as an
n-tuple. Also, a relation that has n columns or n
attributes is said to be of degree n. Each attribute has a
domain, which is the set of values that the attribute can
have. For example, in Figure 6.1 the domain of ITEM-NO
is all positive integers less than 100. Sometimes the
domains of two attributes can be the same.
To differentiate between attributes that have the same
domain, each is given a unique attribute name. The
generalized format

RELATION NAME (attribute name, attribute name, ...)

for example,

IDENTIFICATION (NIIN, FICHE-NO, FRAME-NO, ITEM-NO)

is called the relation structure. If we add constraints
on allowable data values to the relation structure, we
then have a relational schema [Ref. 6].
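
The terms above can be illustrated with a short Python sketch.
It is only an illustration: the class name Relation, the
helper functions, and the choice to model a domain as a
membership test are assumptions made here, not part of any
DBMS discussed in this work. The rows are those of Figure 6.1.

```python
from typing import Callable, Dict, Set, Tuple


class Relation:
    """A relation: named attributes, one domain per attribute, a set of tuples."""

    def __init__(self, name: str, domains: Dict[str, Callable[[str], bool]]) -> None:
        self.name = name
        self.attributes: Tuple[str, ...] = tuple(domains)   # column names, in order
        self.domains = domains                               # attribute -> membership test
        self.tuples: Set[Tuple[str, ...]] = set()            # a set: no duplicates, no order

    @property
    def degree(self) -> int:
        """Number of attributes (columns) of the relation."""
        return len(self.attributes)

    def add(self, *values: str) -> None:
        """Add an n-tuple after checking every value against its domain."""
        if len(values) != self.degree:
            raise ValueError(f"{self.name} expects {self.degree} values")
        for attribute, value in zip(self.attributes, values):
            if not self.domains[attribute](value):
                raise ValueError(f"{value!r} is outside the domain of {attribute}")
        self.tuples.add(values)


def any_text(value: str) -> bool:
    """Assumed domain for NIIN, FICHE-NO and FRAME-NO: any non-empty string."""
    return isinstance(value, str) and value != ""


def item_number(value: str) -> bool:
    """Domain of ITEM-NO as stated in the text: positive integers below 100."""
    return value.isdigit() and 0 < int(value) < 100


identification = Relation(
    "IDENTIFICATION",
    {"NIIN": any_text, "FICHE-NO": any_text, "FRAME-NO": any_text, "ITEM-NO": item_number},
)
identification.add("2335-00-679-0033", "001", "L10", "05")
identification.add("2835-00-682-5360", "001", "A10", "05")
identification.add("2345-00-680-9876", "002", "B77", "08")
identification.add("2345-00-680-9876", "002", "B77", "08")  # identical row: absorbed by the set
```

Storing the tuples in a set mirrors the two properties stated
earlier: no two rows are identical, and row order carries no
meaning.
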
a. Keys of Relations 


The key is the attribute or set of attributes
that uniquely identifies tuples in a relation. A relation key
is formally defined as a set of one or more relation
attributes concatenated so that the following three
properties hold for all time and for any instance of the
relation:

1. Uniqueness: The set of attributes takes on a unique
value in the relation for each tuple.

2. Nonredundancy: If an attribute is removed from the
set of attributes, the remaining attributes do not
possess the uniqueness property.

3. Validity: No attribute in the key may be null.

It is possible for relations to have more than
one relation key; each key is made up of a different set of
attributes. The relation key is often called the candidate
key. If a candidate key is the only key of the relation, it
is generally referred to as the primary key. When an attribute
in one relation is a key of another relation, the attribute
is called a foreign key. Foreign keys are important when
defining constraints across relations. A prime attribute is
an attribute that is part of at least one candidate key. A
nonprime attribute is not part of any candidate key.
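
The three key properties can be tested against a given
instance of a relation, as in the following sketch. The
function names are hypothetical, and the check is necessarily
weaker than the definition: a key must hold "for all time", so
an instance can refute a proposed key but cannot prove one.

```python
def is_unique(rows, attrs):
    """True if no two rows agree on every attribute in attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


def is_candidate_key(rows, attrs):
    """Check uniqueness, nonredundancy and validity on one instance."""
    attrs = list(attrs)
    if not attrs:
        return False
    # Validity: no key attribute may be null (None here).
    if any(row[a] is None for row in rows for a in attrs):
        return False
    # Uniqueness: the attribute set identifies each tuple.
    if not is_unique(rows, attrs):
        return False
    # Nonredundancy: removing any one attribute must destroy uniqueness.
    # Testing single removals suffices, because every superset of a
    # unique attribute set is itself unique.
    return all(not is_unique(rows, [b for b in attrs if b != a]) for a in attrs)


# The rows of Figure 6.1.
rows = [
    {"NIIN": "2335-00-679-0033", "FICHE-NO": "001", "FRAME-NO": "L10", "ITEM-NO": "05"},
    {"NIIN": "2835-00-682-5360", "FICHE-NO": "001", "FRAME-NO": "A10", "ITEM-NO": "05"},
    {"NIIN": "2345-00-680-9876", "FICHE-NO": "002", "FRAME-NO": "B77", "ITEM-NO": "08"},
]

niin_is_key = is_candidate_key(rows, ["NIIN"])                   # unique and minimal
padded_is_key = is_candidate_key(rows, ["NIIN", "FICHE-NO"])     # unique but redundant
fiche_is_key = is_candidate_key(rows, ["FICHE-NO"])              # 001 occurs twice
```

On these three rows FRAME-NO alone would also pass the test,
which shows why keys have to be declared from the meaning of
the data rather than inferred from its current contents.
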


2. Consistency


The goal of relational design is to choose
relations that preserve consistency following database
operations and that store each fact at most once in the
database. Relations that do this are said to be in normal
form. In nonnormal relations, anomalies can arise after
database tuple operations. The three tuple database
operations are as follows:


1. ADD TUPLE (relation name, <attribute values>).
This operation adds a new tuple to a relation. The attribute
values of the tuple are given as part of the operation. For
example:

add tuple (IDENTIFICATION, <2835-00-678-4529,001,501,05>)

would add a new row to the relation in Figure 6.1. An
add-tuple operation will not be allowed if it duplicates a
relation key.

2. DELETE TUPLE (relation name, <attribute values>).
This operation deletes a tuple from a relation. For example:

delete tuple (IDENTIFICATION, <2335-00-679-0033,001,L10,05>)

would delete the first row from the IDENTIFICATION relation.

3. UPDATE TUPLE (relation name, <old attribute
values>, <new attribute values>). This operation changes a
tuple in the relation. For example:

update tuple (IDENTIFICATION, <2835-00-682-5360,001,A10,05>,
<2835-00-682-5360,002,L11,06>)

would change FICHE-NO, FRAME-NO, and ITEM-NO for the NIIN
value 2835-00-682-5360. An update will not be allowed if it
duplicates a relation key.

In a normal relational structure no anomalies arise
after the application of any one of the three preceding
operations with any set of attribute values.
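
The three tuple operations, together with the rule that an
add or update may not duplicate a relation key, can be
sketched as follows. The class KeyedRelation and its method
names are assumptions of this sketch; NIIN is taken as the
relation key of IDENTIFICATION, and the tuples are those used
in the examples above.

```python
class KeyedRelation:
    """Tuples stored under their key value so duplicate keys are detectable."""

    def __init__(self, attributes, key):
        self.attributes = tuple(attributes)
        self.key_positions = [self.attributes.index(a) for a in key]
        self.rows = {}  # key value -> full tuple

    def _key_of(self, values):
        return tuple(values[i] for i in self.key_positions)

    def add_tuple(self, values):
        key = self._key_of(values)
        if key in self.rows:
            raise ValueError(f"add rejected: duplicate relation key {key}")
        self.rows[key] = tuple(values)

    def delete_tuple(self, values):
        key = self._key_of(values)
        if self.rows.get(key) != tuple(values):
            raise LookupError("delete rejected: no such tuple")
        del self.rows[key]

    def update_tuple(self, old, new):
        old_key, new_key = self._key_of(old), self._key_of(new)
        if self.rows.get(old_key) != tuple(old):
            raise LookupError("update rejected: no such tuple")
        if new_key != old_key and new_key in self.rows:
            raise ValueError(f"update rejected: duplicate relation key {new_key}")
        del self.rows[old_key]
        self.rows[new_key] = tuple(new)


ident = KeyedRelation(("NIIN", "FICHE-NO", "FRAME-NO", "ITEM-NO"), key=("NIIN",))
for row in [
    ("2335-00-679-0033", "001", "L10", "05"),
    ("2835-00-682-5360", "001", "A10", "05"),
    ("2345-00-680-9876", "002", "B77", "08"),
]:
    ident.add_tuple(row)

ident.add_tuple(("2835-00-678-4529", "001", "501", "05"))
ident.delete_tuple(("2335-00-679-0033", "001", "L10", "05"))
ident.update_tuple(
    ("2835-00-682-5360", "001", "A10", "05"),
    ("2835-00-682-5360", "002", "L11", "06"),
)
```
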
3. Functional Dependency 


Functional dependency (FD) is a term derived from
mathematical theory; it concerns the dependency of values of
one attribute or set of attributes on those of another
attribute or set of attributes. Formally, a set of
attributes X is functionally dependent on a set of
attributes Y if a given set of values for each attribute in
Y determines a unique value for the set of attributes in X.
The notation Y-->X is often used to denote that X is
functionally dependent on Y. Sometimes Y is called a
determinant of the FD Y-->X. In the simplest case, both X
and Y are made up of one attribute, as shown in Figure 6.2.





NIIN --> FICHE-NO


Figure 6.2 Functional Dependency Diagram.


It is also possible to have two attributes that are
functionally dependent on each other. It is important to
realize that functional dependency is a property of the
information that is represented by relations. That is,
functional dependency is not determined by the use of
attributes in the relations or by the current contents of a
relation.

Given a functional dependency Y-->X (where X and Y
are both sets of attributes), a unique value for each
attribute in X is determined only when the values for the Y
attributes are known. However, it is possible that values of
X can be uniquely determined by only a subset of the
attributes of Y. The term full functional dependency is used
to indicate the minimum set of attributes in a determinant
of an FD. Formally, a set of attributes X is fully
functionally dependent on a set of attributes Y if

1. X is functionally dependent on Y.
2. X is not functionally dependent on any proper subset of Y.

Like functional dependency, full functional
dependency is a property of the information that is
represented by the relation.
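
Because an FD is a property of the information and not of the
current contents, an instance can only show that a proposed
FD is violated. With that caveat, the two definitions can be
sketched as follows; the function names are hypothetical and
the rows are those of Figure 6.1.

```python
from itertools import combinations


def fd_holds(rows, lhs, rhs):
    """True if no two rows agree on lhs but differ on rhs (lhs --> rhs not violated)."""
    seen = {}
    for row in rows:
        determinant = tuple(row[a] for a in lhs)
        dependent = tuple(row[a] for a in rhs)
        if seen.setdefault(determinant, dependent) != dependent:
            return False
    return True


def fully_dependent(rows, lhs, rhs):
    """True if lhs --> rhs holds and no nonempty proper subset of lhs determines rhs."""
    if not fd_holds(rows, lhs, rhs):
        return False
    for size in range(1, len(lhs)):
        for subset in combinations(lhs, size):
            if fd_holds(rows, subset, rhs):
                return False
    return True


rows = [
    {"NIIN": "2335-00-679-0033", "FICHE-NO": "001", "FRAME-NO": "L10", "ITEM-NO": "05"},
    {"NIIN": "2835-00-682-5360", "FICHE-NO": "001", "FRAME-NO": "A10", "ITEM-NO": "05"},
    {"NIIN": "2345-00-680-9876", "FICHE-NO": "002", "FRAME-NO": "B77", "ITEM-NO": "08"},
]

niin_to_fiche = fd_holds(rows, ["NIIN"], ["FICHE-NO"])                    # not violated
fiche_to_niin = fd_holds(rows, ["FICHE-NO"], ["NIIN"])                    # 001 maps to two NIINs
full = fully_dependent(rows, ["NIIN"], ["FICHE-NO"])
not_full = fully_dependent(rows, ["NIIN", "FRAME-NO"], ["FICHE-NO"])      # NIIN alone suffices
```
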


4. Normal Forms


When determining whether a particular relation is in
normal form, we should examine the FDs between the
attributes in the relation. In the notation first proposed
by C. Beeri and co-workers (1978), the relation is defined
as made up of two components: the attributes and the FDs
between them:

R1 = ( {X,Y,Z}, {X-->Y, X-->Z} )

The first component of the relation is the attributes, and
the second component is the FDs. For example,

IDENTIFICATION = ( {NIIN, FICHE-NO, FRAME-NO, ITEM-NO},
{NIIN-->FICHE-NO, NIIN-->FRAME-NO, NIIN-->ITEM-NO} )

The functional dependencies between attributes in a relation
are obviously important when determining the relation's key.

There are a number of normal forms, as shown in
Figure 6.3. Relations are in first normal form (1NF) if all
domains are simple. In other words, all legitimate relations
are in 1NF.

Figure 6.3 Normal Forms.

A relation is normalized by replacing the nonsimple
domains with simple domains. A relation R is in second
normal form (2NF) if every nonprime attribute of R is fully
functionally dependent on each relation key.

A relation R is in third normal form if it has the
following properties:

1. The relation R is in second normal form, and
2. The nonprime attributes are mutually independent;
that is, it has no transitive dependency.

In other words, a relation R is in third normal form
(3NF) if and only if it is in 2NF and every nonprime
attribute is nontransitively dependent on the primary key.
For example, suppose

R = ( {A,B,C,D}, {AB-->C, C-->D} )

where AB is the primary key. Since AB-->C and C-->D, by
transitivity AB-->D; hence relation R is not in 3NF, because
there is a transitive dependency between nonprime
attributes.
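
The transitivity argument in this example can be reproduced
mechanically by computing the closure of a set of attributes
under the given FDs. The sketch below is an illustration
only: the function names are hypothetical, attributes are
single letters so that a string such as "AB" stands for the
set {A,B}, and only the FDs as listed are inspected, not
every FD they imply.

```python
def closure(attrs, fds):
    """All attributes determined by attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If the determinant is already covered, its dependents follow.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def third_normal_form_violations(attributes, fds, key):
    """Listed FDs whose determinant is not a superkey and whose dependents are nonprime.

    With a single primary key, such an FD is either a transitive
    dependency (3NF violation) or a partial dependency (2NF violation).
    """
    prime = set(key)
    violations = []
    for lhs, rhs in fds:
        determinant_is_superkey = closure(lhs, fds) >= set(attributes)
        nonprime_dependents = set(rhs) - set(lhs) - prime
        if nonprime_dependents and not determinant_is_superkey:
            violations.append((lhs, rhs))
    return violations


fds = [("AB", "C"), ("C", "D")]
ab_closure = closure("AB", fds)                                  # AB determines C, then D
violations = third_normal_form_violations("ABCD", fds, "AB")     # C-->D is reported
```
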

In the definition of the third normal form we
assumed that the relation had only one relation key.
Problems arise with the definition when it is applied to
relations that have more than one relation key. The original
definition of 3NF was replaced by a stronger definition
proposed by Boyce and Codd, known as BCNF. A
relation R is in BCNF (Boyce/Codd Normal Form) if and only
if every determinant is a candidate key. For example, suppose

R = ( {A,B,D,E}, {A-->BDE, D-->A} )

Here relation R will be in BCNF if both A and D are keys
of R.

Formally, multivalued dependency is defined as follows: in
relation R(X,Y,Z), X-->>Y if each X value is associated with
a set of Y values in a way that does not depend on Z values.

A relation is in 4NF if it is in BCNF and has no
multivalued dependencies. This definition means that if a
relation has multivalued dependency and is in 4NF, then the
multivalued dependencies have a single value. In other
words, all independent attributes have a single value.

A relation is in 5NF if and only if every join
dependency in the relation R is implied by the candidate
keys of R.

A relation is in Domain-Key normal form (DK/NF) if
every constraint on the relation is a logical consequence of
the definition of the keys and domains. A constraint is any
rule on the static value of the attributes that is precise
enough that we can evaluate whether or not it is true.
Examples of constraints are inter-relation constraints,
functional dependencies, multivalued dependencies, and join
dependencies. DK/NF means that if we can define keys and
domains in such a way that all constraints will be
satisfied, then modification anomalies are impossible.
Unfortunately, there is no known way to convert a relation
to DK/NF automatically, nor is it even known which relations
can be converted to DK/NF. In spite of this, DK/NF can be
exceedingly useful for practical database design.
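
Of the normal forms described in this section, BCNF lends
itself most directly to a mechanical check: every determinant
must determine all attributes of the relation. The following
sketch assumes single-letter attribute names and inspects
only the FDs as listed; the function names are hypothetical.

```python
def closure(attrs, fds):
    """All attributes determined by attrs under fds, a list of (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def is_bcnf(attributes, fds):
    """True if the determinant of every nontrivial listed FD is a superkey."""
    return all(
        closure(lhs, fds) >= set(attributes)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs)  # skip trivial FDs
    )


# The BCNF example from the text: both A and D determine every attribute.
bcnf_example = is_bcnf("ABDE", [("A", "BDE"), ("D", "A")])

# The 3NF counterexample from the text: C is a determinant but not a key.
not_bcnf_example = is_bcnf("ABCD", [("AB", "C"), ("C", "D")])
```
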


B. ADVANTAGES AND DISADVANTAGES OF RELATIONAL MODELS
1. Advantages
a. Simplicity 


The end user is presented with a simple data
model. User requests are formulated in terms of the
information content and do not reflect any complexities due
to system-oriented aspects. A relational data model is what
the user sees; it is not necessarily what will be
implemented physically [Ref. 11].


b. Nonprocedural Request 


Because there is no positional dependency
between the relations, requests do not have to reflect any
preferred structure and therefore can be nonprocedural.
c. Data Independence


This should be one of the major objectives of
any database management system. The relational data model
removes the details of the storage structure and access
strategy from the user interface. The relational model
provides a relatively higher degree of data independence
than do the network and hierarchical models. However, the
design of the relations must be complete and accurate in
order to make use of this property of the relational model.
d. Theoretical Foundation


The relational data model is based on the
well-developed mathematical theory of relations. The
rigorous method of designing a database using normalization
gives this model a solid foundation. This kind of foundation
does not exist for the other two models.
2. Disadvantages


A disadvantage sometimes cited for the relational
model is machine performance. With present-day hardware the
JOIN operation is likely to take substantial machine time.
It is feasible with small relations, but some commercial
files are hundreds of millions of bytes long. In
understanding the performance issue, it is very important to
remember that the relations and the operations on them, such
as the JOIN, will not necessarily occur physically. Instead,
equivalent results will be produced by means of pointer
structures or indices. It appears today that technological
improvements in providing faster and more reliable hardware
may solve this problem.
VII. RELATIONAL DATABASE DESIGN





The relational model is attractive in database
design because it provides formal criteria for logical
structure, namely, normal form relations. The problem, then,
is to choose a design procedure to produce normal form
relations. Two different approaches have been proposed:

1. Decomposition procedures: These commence with a set
of one or more relations and decompose nonnormal
relations in this set into normal forms.

2. Synthesis procedures: These commence with a set of
functional dependencies and use them to construct
normal form relations.

Most designs commence with an information gathering
phase in which a set of data elements and the FDs between
them are identified. The information is then used to produce
normal relations. On the other hand, one could conceive of a
procedure where all the data attributes are considered to
form one relation, which is then decomposed in subsequent
design steps.


A. RELATIONAL DESIGN CRITERIA


Beeri and co-workers (1978) have identified three
relational design criteria:
1. SEPARATION: The original specifications are separated
into relations that satisfy certain conditions.
2. REPRESENTATION: The final structure must correctly
represent the original specifications.
3. REDUNDANCY: The final structure must not contain any
redundant information.
The separation criterion is that the database must be
separated into a number of normal relations. The other two
criteria are relatively general. In specific terms each can
be applied to attributes, FDs, or data. Here, the criteria
will be defined more specifically. For example, consider the
relation R = ( {A,B,C}, {A-->B, A-->C} ).
Here R comprises three attributes, A, B, and C. The
functional dependencies between these attributes are A-->B
and A-->C. The notation used to describe the input and
output of the design process is Sin and Sout. Sin and Sout
are sets of relations. Here Sin is the input to the design
process and Sout is the output of the design process.
1. Representation Criteria


One goal of any design process is to produce an
output design, Sout, that accurately represents Sin. All the
relations in Sout must satisfy the conditions for normal
form. Beeri and co-workers (1978) have defined three
representation criteria for the representation of Sin by
Sout:

1. REP1: The relations in Sout contain the same
attributes as Sin.

2. REP2: The relations in Sout contain the same
attributes and the same FDs as Sin.

3. REP3: The relations in Sout contain the same
attributes and the same data as Sin.


REP1 requires all the attributes in Sin to also
appear in the relations in Sout, but it does not consider
any dependencies between the attributes. Under REP2,
Sin contains a set of attributes and a set of functional
dependencies, and Sout also contains a set of attributes and
a set of FDs. REP2 requires that each FD in Sin be either:

1. Contained as an FD in one of the relations in Sout, or
2. Derived from the FDs in the relations in Sout, using
the FD inference rules.

For example, in Figure 7.1,
Sin = ( {A,B,C}, {A-->B, C-->B} ) and
Sout = (R2, R3), where R2 = ( {A,B}, {A-->B} ) and
R3 = ( {B,C}, {C-->B} ).

Thus R2 and R3 constitute the decomposition by
projection of Sin. Each of the functional dependencies in
Sin is contained in Sout; hence we can say that Sout is a
REP2 representation of Sin. Figure 7.2, by contrast, shows a
decomposition that is not a REP2 representation
of Sin [Ref. 10].

Figure 7.1 includes a relation R1 that is decomposed
by projection into two relations, R2 and R3, in Sout. Note
that R2 and R3 do not contain the same information as Sin,
since different responses are obtained to the same question
applied to Sin and Sout. Consider the question: to what c is
a1 related? In Sin the answer is {c1}; in Sout the answer is
{c1,c2}. Hence Sout is not an REP3 representation of Sin.
The join in Figure 7.1 contains tuples additional
to those of Sin and is sometimes known as a CROSS JOIN. Note
that in Figure 7.2 the two relations Y1 and Y2 in Sout are
an REP3 representation of Sin, because their join contains
exactly the same tuples as the original relation, R.


2. Lossless Decomposition


Formally, a lossless decomposition can be described
as follows. The decomposition of a relation R(X,Y,Z) into
relations R1 and R2 is defined by two projections: R1 =
projection of R over X,Y and R2 = projection of R over X,Z,
where X is the set of common attributes in R1 and R2. The
decomposition is lossless if R = join of R1 and R2 over X.
The decomposition is lossy if R is a proper subset of the
join of R1 and R2 over X.
Figure 7.1

Figure 7.2

CONDITIONS: The decomposition of R(X,Y,Z) into R1(X,Y)
and R2(X,Z) is lossless if, for the attribute X common to
both R1 and R2, either X-->Y or X-->Z. Thus in Figure 7.1 the
common attribute of R2 and R3 is B, but neither B-->A nor
B-->C is true; hence the decomposition is lossy. In Figure
7.2 the common attribute of Y1 and Y2 is X, and both X-->Y
and X-->Z are true; hence the decomposition is lossless.
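
The definition can be checked directly on sample data by
projecting a relation, joining the projections, and comparing
the result with the original. In the sketch below the
function names are hypothetical, and the two small relations
are invented to match the FDs stated for Figures 7.1 and 7.2
(A-->B and C-->B; X-->Y and X-->Z); they are not the figures'
actual contents.

```python
def project(rows, attrs):
    """Projection: keep only attrs and drop duplicate rows."""
    unique = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, values)) for values in unique]


def natural_join(left, right):
    """Natural join over whatever attributes the two inputs share."""
    joined = []
    for l_row in left:
        for r_row in right:
            common = l_row.keys() & r_row.keys()
            if all(l_row[a] == r_row[a] for a in common):
                joined.append({**l_row, **r_row})
    return joined


def as_set(rows):
    """Order-independent form of a list of rows, for comparison."""
    return {tuple(sorted(row.items())) for row in rows}


def is_lossless(rows, attrs1, attrs2):
    """True if joining the two projections gives back exactly the original rows."""
    rejoined = natural_join(project(rows, attrs1), project(rows, attrs2))
    return as_set(rejoined) == as_set(rows)


# Satisfies A-->B and C-->B; the common attribute B determines neither A nor C.
r1 = [
    {"A": "a1", "B": "b1", "C": "c1"},
    {"A": "a2", "B": "b1", "C": "c2"},
]
# Satisfies X-->Y and X-->Z; the common attribute X determines both.
r = [
    {"X": "x1", "Y": "y1", "Z": "z1"},
    {"X": "x2", "Y": "y1", "Z": "z2"},
]

lossy_join = natural_join(project(r1, ["A", "B"]), project(r1, ["B", "C"]))
c_for_a1 = sorted(row["C"] for row in lossy_join if row["A"] == "a1")
```

In the lossy case the join relates a1 to both c1 and c2,
which is the spurious answer discussed under the REP3
criterion.
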
3. Redundancy Criteria 


Redundancy can be defined in various ways. One set
of redundancy criteria is as follows [Ref. 7]:

1. RED1: A relation in Sout is redundant if its
attributes are contained in the other relations in
Sout.

2. RED2: A relation in Sout is redundant if its FDs are
the same as, or can be derived from, the FDs in the other
relations in Sout.

3. RED3: A relation in Sout is redundant if its content
can be derived from the contents of other relations
in Sout.

RED1 is not a powerful criterion, because during
separation it is often necessary to create separate
relations that represent FDs between attributes which may
appear in other relations. RED2 and RED3 can be quite useful
criteria. Any design algorithm should in particular avoid
RED3 redundancy, because it would keep the same data in more
than one relation. Such relations could all be in normal
form, and no anomalies would occur within the relations.
But interrelation anomalies would arise if the same fact
were updated in one relation but not the other. RED2
redundancy would cause the same problem.
B. RELATIONAL DESIGN PROCEDURE


It is interesting to note that in Figure 7.1 the design
Sout is an REP2 but not an REP3 representation of Sin,
whereas in Figure 7.2 the design Sout is an REP3 but not an
REP2 representation of Sin. This situation creates one of
the problems of relational research, namely, to find a
design procedure that yields an Sout that is both an REP2
and an REP3 representation of Sin. Similarly, design
procedures should aim to reduce redundancy, but here again
different design procedures can result in either RED2 or
RED3 representations of Sin [Ref. 8].

There are two classes of algorithms: decomposition and
synthesis. Decomposition algorithms commence with one
relation and successively decompose it into normal form
relations. The concepts of 3NF and BCNF are not sufficient
for decomposition algorithms, so the ideas of multivalued
dependency and 4NF have to be introduced.

Synthesis algorithms use FDs to produce normal form
relations. For these algorithms to be successful it is
necessary to ensure that:

1. FDs in Sin correctly represent user semantics,
2. Algorithms can be devised to produce relations in
Sout that correctly and nonredundantly represent Sin.

If synthesis algorithms are to be effective, their input
must also describe those nonfunctional relationships that
cannot be expressed as FDs between attributes. Perhaps the
best-known synthesis algorithm is the one devised by
Bernstein. It is premised on grouping all FDs with the same
determinant and constructing a relation for each such group.
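
The grouping step on which the synthesis approach is premised
can be sketched in a few lines. This is only that one step,
under a hypothetical function name: Bernstein's complete
algorithm also removes redundant FDs and extraneous
attributes before grouping and merges groups with equivalent
keys afterwards, none of which is attempted here.

```python
from collections import defaultdict


def group_by_determinant(fds):
    """Group FDs by determinant and form one (key, attributes) pair per group."""
    groups = defaultdict(set)
    for lhs, rhs in fds:
        groups[frozenset(lhs)] |= set(rhs)
    # Each relation holds the determinant (its key) plus everything it determines.
    return [(sorted(det), sorted(det | deps)) for det, deps in groups.items()]


# Sin of Figure 7.1: A-->B and C-->B give the two relations R2(A,B) and R3(B,C).
figure_7_1 = group_by_determinant([(("A",), ("B",)), (("C",), ("B",))])

# The FDs of IDENTIFICATION share one determinant and so yield one relation.
identification = group_by_determinant([
    (("NIIN",), ("FICHE-NO",)),
    (("NIIN",), ("FRAME-NO",)),
    (("NIIN",), ("ITEM-NO",)),
])
```
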
C. PHYSICAL DESIGN OF INVENTORY DATABASE


1. Mapping from SDM to Relational Model


The logical design of the inventory database cannot
be used as the physical design of a relational database. For
example, in the IDENTIFICATION and UNIT records, there are
some multivalued attributes, which are not allowed in a
relation. The relations must be transformed so that each
attribute has only one value per tuple. Also, the logical
design in Chapter 5 allows tuples to be contained in other
tuples, which cannot be done physically. Relations in the
logical design have to be redefined to eliminate this
problem.

Consider the relations UNIT and IDENTIFICATION.
Auth_to_use of IDENTIFICATION is actually a collection of
tuples representing the UNITs which are using a specified
item. We can eliminate Auth_to_use of IDENTIFICATION because
whenever we need this information we can get it by use of
the Data Manipulation Language (DML). It is possible to
construct contained tuples by DML joins. In this case,
Auth_to_use will be constructed and not stored.
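
The idea of constructing Auth_to_use on demand can be
sketched as a join between the two relations. The attribute
names follow Figures 7.3 and 7.4; the function name, the unit
codes, and the quantities are invented for illustration, and
the assumption that Nsn_No_Use of UNIT_INVENTORY refers to
Nsn_No of IDENTIFICATION is made for this sketch.

```python
def with_auth_to_use(identification, unit_inventory):
    """Attach to each item the units using it, derived by a join rather than stored."""
    result = []
    for item in identification:
        units = sorted(
            unit["Unit_Code"]
            for unit in unit_inventory
            if unit["Nsn_No_Use"] == item["Nsn_No"]  # join condition
        )
        result.append({**item, "Auth_to_use": units})
    return result


# Hypothetical sample data.
identification = [
    {"Nsn_No": "2335-00-679-0033", "Tot_Qty_On_Hand": 12},
    {"Nsn_No": "2835-00-682-5360", "Tot_Qty_On_Hand": 3},
]
unit_inventory = [
    {"Unit_Code": "U1", "Nsn_No_Use": "2335-00-679-0033", "Qty_On_Hand": 7},
    {"Unit_Code": "U2", "Nsn_No_Use": "2335-00-679-0033", "Qty_On_Hand": 5},
    {"Unit_Code": "U2", "Nsn_No_Use": "2835-00-682-5360", "Qty_On_Hand": 3},
]

derived = with_auth_to_use(identification, unit_inventory)
```

Because the collection is recomputed from UNIT_INVENTORY each
time, the fact that a unit uses an item is stored only once.
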

The process just described can be used to transform
the logical schema into a relational schema. All contained
tuples have been replaced using the same logic. Auth_to_use
of IDENTIFICATION is deleted and interrelation constraints
are added. Similarly, Depot_of_registry of IDENTIFICATION,
Subordinate_comm of UNIT, and Past_amount_purchased of
IDENTIFICATION are deleted and interrelation constraints are
added.

The resulting design is shown in Figures 7.3 and 7.4.
Figure 7.3 shows the relations, attributes, and
interrelation constraints, and Figure 7.4 shows the domains
and attribute-domain correspondences.
IDENTIFICATION (Nsn_No, Tot_Qty_On_Hand,
Sum_of_used_units, Max_Auth_Qty_Hand)
KEY : Nsn_No


DOCUMENT IDENTIFICATION (Nsn_No, Document, Supp_name)
KEY : Nsn_No


UNIT_INVENTORY (Unit_Code, Nsn_No_Use, Qty_On_Hand,

Figure 7.3 Records of Relational Schema.
| Attribute         | Domain                |
| Nsn_No            | NATIONAL STOCK NUMBER |
| Document          | DOCUMENTATION         |
| Tot_qty_on_hand   | TOTAL QUANTITY        |
| Max_auth_qty_hand | MAXIMUM QUANTITY      |
| Sum_of_used_units | SUM OF USED           |
| Supp_name         | S_NAME                |
| Unit_code         | UNIT NAME             |
| Superior_comm     | UNIT NAME             |
| Nsn_no_use        | NATIONAL STOCK NUMBER |
| Qty_on_hand       | TOTAL QUANTITY        |
| Used_amount       | SUM OF USED           |
| Req_amount        | REQUIRED AMOUNT       |
| Location          | LOCATIONS             |
| Date              | DATES                 |
| Amount            | ORDER_AMOUNT          |
| Ship_type         | SHIPMENT TYPE         |
| Depo_id           | D_NAME                |
| Nsn_no_registry   | NATIONAL STOCK NUMBER |
| Stock_amount      | TOTAL AMOUNT          |
| Country           | C_NAME                |
| Address           | ADDRESSES             |
| City              | CITIES                |


Figure 7.4 Attributes and Domains.
VIII. SYSTEM R: RELATIONAL DATABASE MANAGEMENT





System R is a database management system which provides
a high level relational data interface. The system provides
a high level of data independence by isolating the end-user
as much as possible from underlying storage structures. The
system permits definition of a variety of relational views
on common underlying data. Data control features are
provided, including authorization, integrity assertions,
triggered transactions, a logging and recovery subsystem,
and facilities for maintaining data consistency in a
shared-update environment. System R supports a relational
database, i.e., a database in which all data is perceived by
users in the form of tables. All access to this database is
via a data sublanguage called SEQUEL.


A. ARCHITECTURE AND SYSTEM STRUCTURE


Figure 8.1 gives a functional view of the system,
including its major components and interfaces. The
Relational Storage Interface (RSI) is an interface which
handles access to single tuples of base relations.

This interface and its supporting system, the Relational
Storage System (RSS), is actually a complete storage system
in that it manages devices, space allocation, storage
buffers, transaction consistency and locking, deadlock
detection, backout, transaction recovery, and system
recovery. It also maintains indices on selected fields of
base relations and pointer chains across relations.

The Relational Data Interface (RDI) is the external
interface which can be called directly from a programming
language, or used to support various emulators and other
interfaces. The Relational Data System (RDS), which supports
the RDI, provides authorization, integrity enforcement, and
support for alternative views of data. The high level SEQUEL
language is embedded within the RDI and is used as the basis
for all data definition and manipulation. In addition, the
RDS maintains the catalogs of external names, since the RSS
uses only system generated internal names. The RDS contains
an optimizer which chooses an appropriate access path for
any given request from among the paths supported by the RSS.
The RSS and RDS are examined in detail in the following
two sections.

Figure 8.1 Architecture of System R.


B. THE RELATIONAL DATA SYSTEM


The Relational Data Interface (RDI) is the principal
external interface of System R. The data definition
facilities of the RDI allow a variety of alternative
relational views to be defined on common underlying data.
The RDS is the subsystem which implements the RDI. The RDI
consists of a set of operators which may be called from PL/I
or other host programming languages. All facilities of the
SEQUEL data sublanguage are available at the RDI by means of
the RDI operator called SEQUEL. SEQUEL is designed to be
used both as a stand-alone language for interactive users
and as a data sublanguage embedded in a host programming
language such as PL/I. In the latter case the SEQUEL
statements in the program are identified by a precompiler,
which replaces them with valid PL/I calls to a run-time
module. This module provides the environment for executing
an application program that has been through the
precompilation process. The precompilation process is
described below [Ref. 6].

1. The precompiler scans the source program and locates
the embedded SEQUEL statements.

2. For each statement it finds, the precompiler decides
on a strategy for implementing that statement in
terms of RSI operations. Having made its decisions,
the precompiler generates machine language routines
(including calls to the RSS) that will implement the
chosen strategy. The set of all such routines
together constitutes the access module for the given
source program. The access module is itself stored in
the database.

3. The precompiler replaces each of the original
embedded SEQUEL statements by an ordinary PL/I
call to the run-time module of the RDS.

The modified source program can now be compiled by the
PL/I compiler in the normal way. This process is depicted in
Figure 8.2.

SEQUEL provides extensive query facilities based on
English key words. The relational DBMS available at our
school is ORACLE. In terms of query facilities there is no
big difference between System R and ORACLE. The query, data
manipulation, and data definition facilities of ORACLE will
be illustrated over the inventory database by a series of
examples in Chapter 9.

Figure 8.2 Precompilation Process.


1. Data Definition Facilities


The primary data structure in System R is the base
relation (base table). A base relation is a table that
has its own independent existence and is represented in the
physical database by a stored file. A base table can be
created at any time by executing the SEQUEL DDL statement
CREATE TABLE, which takes the general form:

CREATE TABLE base-table-name
( field-definition, ... )
{ IN SEGMENT segment-name }

where a field-definition, in turn, takes the form

field-name ( data-type {, NONULL } )

Successful execution of the CREATE TABLE statement causes a
new, empty base table to be created in the specified segment
with the specified base-table-name and field definitions.
The user may now proceed to enter data into that
table using the SEQUEL INSERT statement.

A System R database is partitioned into a set of disjoint
segments, which provide a mechanism for controlling the
allocation of storage and the sharing of data among users.
Any given base table is wholly contained within a single
segment, and indices on that base table are also contained
in that same segment. However, a given segment may contain
several base tables and their indices. A public segment
contains shared data that can be simultaneously accessed by
multiple users. A private segment contains data that can be
used by only one user at a time. The segment specification
is optional in the CREATE TABLE statement; if it is omitted,
the base table will go in a private segment belonging to the
user that issued the CREATE TABLE.

Each field definition in CREATE TABLE includes three items:
a field-name, a data-type for the field, and optionally a
NONULL specification. The field name has to be unique within
the base table. System R supports the concept of null field
values. Null is a special value that is used to represent
"value unknown" or "value inapplicable"; a field declared
NONULL may not contain it.

By using the EXPAND TABLE statement, an existing base table 
can be expanded at any time by adding a new column at the 
right: 

     EXPAND TABLE base-table-name 
          ADD FIELD field-name ( data-type ) 

The important point is that the specification NONULL is not 
permitted in EXPAND TABLE. It is also possible to destroy 
an existing base table at any time: 

     DROP TABLE base-table-name 

All records in the specified base table are deleted, all 
indexes and views on that table are destroyed, and the table 
itself is then also destroyed; that is, its description is 
removed from the dictionary and its storage space is 
released [Ref. 7]. 
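
The three data definition statements have close analogues in 
present-day SQL. The following is a minimal sketch, assuming 
Python's built-in sqlite3 module as a stand-in for System R 
(SEQUEL itself is not available): NOT NULL plays the role of 
NONULL, ALTER TABLE ... ADD COLUMN plays the role of EXPAND 
TABLE, and DROP TABLE is unchanged. The table and column 
names are hypothetical. 

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# CREATE TABLE: NOT NULL is SQLite's counterpart of SEQUEL's NONULL.
cur.execute("CREATE TABLE part (nsn_no TEXT NOT NULL, qty INTEGER)")
cur.execute("INSERT INTO part VALUES ('1342-241-4111', 15000)")

# Counterpart of EXPAND TABLE ... ADD FIELD: add a column at the right.
cur.execute("ALTER TABLE part ADD COLUMN depot TEXT")

# The existing row has no value for the new column, so it reads as null.
# This suggests why a NONULL specification cannot be allowed on an
# added field: rows that already exist would violate it immediately.
rows = cur.execute("SELECT nsn_no, qty, depot FROM part").fetchall()

# DROP TABLE removes the rows and the table's catalog description.
cur.execute("DROP TABLE part")
remaining = cur.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()
```

After the ALTER TABLE step the stored row reads back with a 
null in the new column, which is the behaviour the NONULL 
restriction on EXPAND TABLE is meant to keep consistent. 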

The query power of SEQUEL may be used to define a view as a 
relation derived from one or more other base tables. This 
view may then be used in the same ways as a base table: 
queries may be written against it, other views may be defined 
on it, and in certain circumstances described below, it may 
be updated. Any SEQUEL query may be used as a view definition 
by means of a DEFINE VIEW statement: 

     DEFINE VIEW view-name 
          { ( field-name , ... ) } 
          AS SELECT-statement 

Views are dynamic windows on the database as shown in Figure 
8.3. In System R, a view that is to accept updates must be 
derived from a single base table. Moreover, it must satisfy 
the following constraints: 

1. Each distinct row of the view must correspond to a 
   distinct and uniquely identifiable row of the base 
   table. 

2. Each distinct column of the view must correspond to a 
   distinct and uniquely identifiable column of the base 
   table. 

If a view does satisfy constraints 1 and 2, then any update 
against it can easily be mapped into an update on the 
corresponding base table. 
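
To illustrate the idea of a view as a dynamic window, the 
following sketch assumes Python's sqlite3 module and uses 
CREATE VIEW as the modern counterpart of DEFINE VIEW. The 
table, view, and data are hypothetical. 

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE supplier (supp_name TEXT, country TEXT);
    INSERT INTO supplier VALUES ('itt', 'usa'), ('asal', 'tur');
    -- The view stores only its defining query, not a copy of the rows.
    CREATE VIEW usa_supplier AS
        SELECT supp_name FROM supplier WHERE country = 'usa';
""")

before = conn.execute("SELECT supp_name FROM usa_supplier").fetchall()

# A change to the base table is visible through the view at once.
conn.execute("INSERT INTO supplier VALUES ('dec', 'usa')")
after = conn.execute(
    "SELECT supp_name FROM usa_supplier ORDER BY supp_name"
).fetchall()
```

Each row and column of this view corresponds to exactly one 
row and column of the single base table, so it is the kind 
of view that constraints 1 and 2 describe. 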


There is another SEQUEL command in the data definition 
facility: KEEP TABLE. It causes a temporary table to become 
permanent. Normally, temporary tables are destroyed when the 
user who created them logs off. 


2. Data Control Facilities 


System R has extensive data control facilities that enable 
users to control access to their data by other users, and to 
exercise control over the integrity of data values. The data 
control facilities have four aspects: transactions, 
authorization, integrity assertions, and triggers. 

A transaction is a series of statements which the user 
wishes to be processed as an atomic act. The meaning of 
"atomic" depends on the level of consistency specified by 
the user. The user controls transactions by the operators 
BEGIN-TRANS and END-TRANS. The user may specify save points 
within a transaction by the operator SAVE. As long as a 
transaction is active, the user may back up to the beginning 
of the transaction or to any internal save point by the 
operator RESTORE. 
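
The save point mechanism can be sketched with Python's 
sqlite3 module, assuming that SQL's BEGIN, SAVEPOINT, 
ROLLBACK TO, and COMMIT are acceptable stand-ins for the 
System R operators BEGIN-TRANS, SAVE, RESTORE, and END-TRANS. 
The table and data are hypothetical. 

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# transaction boundaries are exactly the statements issued below.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE stock (item TEXT, qty INTEGER)")

conn.execute("BEGIN")                        # BEGIN-TRANS
conn.execute("INSERT INTO stock VALUES ('bolt', 100)")
conn.execute("SAVEPOINT sp1")                # SAVE
conn.execute("INSERT INTO stock VALUES ('nut', 50)")
conn.execute("ROLLBACK TO sp1")              # RESTORE to the save point
conn.execute("COMMIT")                       # END-TRANS

items = [row[0] for row in conn.execute("SELECT item FROM stock")]
```

Only the work done after the save point is undone; the first 
insertion survives and is committed with the transaction. 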

System R allows for an extremely simple method of 
authorization checking. System R maintains two tables for 
the use of the authorization subsystem: SYSAUTH and 
SYSCOLAUTH. The SYSAUTH table has up to two rows for each 
combination of relation (base or view) and user. The columns 
in the SYSAUTH table correspond to user ID, base relation or 
view name, type (base or view), a column for each of the 
privileges on the relation (Y or N), and a column for the 
grant option (Y or N). For each relation on which a user is 
authorized to perform some action, there are up to two 
tuples in SYSAUTH: one for grantable and the other for 
non-grantable privileges. In case the user has update rights 
on a relation, the table SYSCOLAUTH indicates precisely 
those columns of the relation on which the user has the 
update privilege. These two tables, SYSAUTH and SYSCOLAUTH, 
are updated whenever a new base relation or view is created 
or an authorized user executes a GRANT statement, thereby 
granting a set of privileges to one or more other users. The 
two tables are referenced immediately before the execution 
of any SEQUEL statement [Ref. 5]. 
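
The lookup described above can be sketched as follows. This 
is only an illustration under stated assumptions: the two 
catalog tables are modelled as Python lists of dictionaries, 
and the user name, relation name, privilege names, and helper 
functions are all hypothetical rather than System R's actual 
catalog layout. 

```python
# Hypothetical stand-in for SYSAUTH: up to two rows per (user, relation),
# one for non-grantable and one for grantable privileges.
SYSAUTH = [
    {"user": "smith", "relation": "PART", "type": "base",
     "read": "Y", "insert": "N", "delete": "N", "update": "Y", "grant": "N"},
    {"user": "smith", "relation": "PART", "type": "base",
     "read": "Y", "insert": "N", "delete": "N", "update": "N", "grant": "Y"},
]

# Hypothetical stand-in for SYSCOLAUTH: columns covered by update rights.
SYSCOLAUTH = [
    {"user": "smith", "relation": "PART", "column": "QTY"},
]


def is_authorized(user, relation, privilege, column=None):
    """Return True if some SYSAUTH row gives the user the privilege."""
    held = any(
        row["user"] == user and row["relation"] == relation
        and row[privilege] == "Y"
        for row in SYSAUTH
    )
    if not held:
        return False
    if privilege == "update" and column is not None:
        # Update rights are refined column by column in SYSCOLAUTH.
        return any(
            row["user"] == user and row["relation"] == relation
            and row["column"] == column
            for row in SYSCOLAUTH
        )
    return True


def may_grant(user, relation, privilege):
    """Return True if the user holds the privilege with the grant option."""
    return any(
        row["user"] == user and row["relation"] == relation
        and row["grant"] == "Y" and row[privilege] == "Y"
        for row in SYSAUTH
    )
```

In this sketch the user may update only the QTY column, and 
may pass the read privilege on to others but not the update 
privilege, because only the grantable row carries read. 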

The third important aspect of data control is that of 
integrity assertions. Any SEQUEL logical expression 
associated with a base table or view may be stated as an 
integrity assertion. At the time an assertion is made by an 
ASSERT statement, its truth is checked; if true, the 
assertion is enforced until it is explicitly dropped by a 
DROP ASSERTION statement. Any data modification by any user 
which violates an active integrity assertion is rejected. 
Assertions may apply to individual tuples or to sets of 
tuples. 

The fourth aspect of data control, triggers, is a 
generalization of the concept of assertion. A trigger causes 
a prespecified sequence of SEQUEL statements to be executed 
whenever some triggering event occurs. The triggering event 
may be retrieval, insertion, deletion, or update of a 
particular base table or view. The RDI can monitor such 
events by simply scanning a transaction for a SEQUEL 
statement that corresponds to a particular triggering event. 
Immediately after each of these statements, a call statement 
is included to invoke the appropriate trigger routine. 
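
A trigger of this kind can be sketched with Python's sqlite3 
module. This assumes SQLite's CREATE TRIGGER as a stand-in 
for the System R facility; SQLite supports insertion, 
deletion, and update events but not retrieval events. The 
table and trigger names are hypothetical. 

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stock (item TEXT, qty INTEGER);
    CREATE TABLE audit (item TEXT, old_qty INTEGER, new_qty INTEGER);

    -- The triggering event is an update of the qty field of stock.
    CREATE TRIGGER log_qty AFTER UPDATE OF qty ON stock
    BEGIN
        INSERT INTO audit VALUES (OLD.item, OLD.qty, NEW.qty);
    END;

    INSERT INTO stock VALUES ('bolt', 100);
    UPDATE stock SET qty = 80 WHERE item = 'bolt';
""")

audit_rows = conn.execute("SELECT * FROM audit").fetchall()
```

The update fires the prespecified statement without any 
action by the user who issued it. 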
3. Data Manipulation Statements 


The RDI facilities for insertion, deletion, and update of 
tuples are also provided via the SEQUEL data sublanguage. 
SEQUEL operates on both base tables and views. It can be 
used to manipulate either one tuple at a time or a set of 
tuples with a single command. By using these facilities, it 
is possible to assign the result of a query to a newly 
created relation. 

An insertion statement in SEQUEL may provide only some of 
the values for the new tuple, specifying the names of the 
fields which are provided. Fields which are not provided are 
set to the null value. The physical position of the new 
tuple in storage is influenced by the "clustering" 
specification made on associated RSS access paths. 

Deletion is done by means of a DELETE statement accompanied 
by a WHERE clause. The WHERE clause specifies the conditions 
that must be satisfied by the records to be deleted. The RDI 
can translate an UPDATE statement in one of two ways: 

1. By using the RETRIEVE command to determine the 
   addresses of the selected records, and then using the 
   REPLACE command to modify these records one at a 
   time. 

2. By using the REPLACE command to modify all the 
   selected records simultaneously. 

Which of these two methods is to be used depends on the 
actual SEQUEL statement. If the SET clause makes identical 
changes to all the selected tuples, then the second method 
should be used. The SEQUEL assignment statement allows the 
result of a query to be copied into a new permanent or 
temporary relation in the database. This has the same effect 
as a query followed by the RDI operator KEEP. The execution 
of an assignment statement by the RDI is done in two parts: 

1. The records satisfying the query are retrieved. 

2. A new relation is created with the records retrieved, 
   and these records are then stored in the database. 

A series of examples will be given for the inventory 
database by using the ORACLE relational DBMS in Chapter 9. 


4. Optimizer 


The objective of the optimizer is to find a low cost means 
of executing a SEQUEL statement, given the data structures 
and access paths available. The optimizer attempts to 
minimize the expected number of pages to be fetched from 
secondary storage into the RSS buffers during execution of 
the statement. Only page fetches made under the explicit 
control of the RSS are considered. If necessary, the RSS 
buffers will be pinned in real memory to avoid additional 
paging activity caused by the operating system, such as the 
VM/370 operating system. The cost of CPU instructions is 
also taken into account by means of an adjustable 
coefficient, which is multiplied by the number of tuple 
comparison operations to convert them to equivalent page 
accesses. The coefficient can be adjusted according to 
whether the system is computation-bound or I/O-bound 
[Ref. 6]. 
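
The cost measure described above can be written as 
cost = page fetches + W x tuple comparisons, where W is the 
adjustable coefficient. The following sketch shows how the 
choice of W can change which access path looks cheapest. The 
function names, plan names, and all figures are hypothetical 
and chosen only for illustration. 

```python
def plan_cost(page_fetches, tuple_comparisons, cpu_weight):
    """Estimated cost of a plan, expressed in equivalent page accesses."""
    return page_fetches + cpu_weight * tuple_comparisons


def cheapest(plans, cpu_weight):
    """Return the name of the plan with the lowest estimated cost."""
    return min(plans, key=lambda name: plan_cost(*plans[name], cpu_weight))


# Hypothetical figures: (page fetches, tuple comparisons) per access path.
plans = {
    "segment_scan": (500, 10_000),
    "index_scan": (520, 200),
}

# I/O-bound system: CPU work is nearly free, so fewer page fetches win.
io_bound_choice = cheapest(plans, cpu_weight=0.0001)

# Computation-bound system: comparisons are expensive, so the index wins.
cpu_bound_choice = cheapest(plans, cpu_weight=0.01)
```

With the small coefficient the segment scan costs about 501 
against 520 for the index scan; with the larger coefficient 
the costs become 600 against 522 and the choice reverses. 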
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After analyzing any SEQUEL statement, the optimizer 
produces an Optimized Package (OP) containing the parse tree 
and a plan for executing the statement. If the statement is 
a query, the OP is used to materialize tuples as they are 
called for by the fetch command. If the statement is a view 
definition, the OP is stored in the form of a Pre-Optimized 
Package (POP), which can be fetched and utilized whenever an 
access is made via the specified view. If any change is made 
to the structure of the base table or to the access paths 
maintained on it, the POPs of all views defined on that base 
table are invalidated, and each view must be reoptimized 
from its defining SEQUEL code to form a new POP. 


C. THE RELATIONAL STORAGE SYSTEM 


The RSS is essentially a powerful access method. Its 
primary function is to handle all details of the physical 
level and to present its user with an interface called the 
RSI. The user of the RSS is normally not a direct user, but 
is code generated by the RDS in compiling some SEQUEL 
statement. The RSI was specifically designed to be a good 
target for the SEQUEL compiler. 

As shown in Figure 8.3, the basic data object at the RSI 
is the stored file, which is the internal representation of 
a base table. Rows of the table are represented by records 
of the file; the stored records within one stored file need 
not be physically adjacent in storage. An arbitrary number 
of indexes over any given stored file is supported by the 
RSS, thus providing additional access paths to that file. 
The RSS objects (stored files, indexes, etc.) and the 
associated operators together constitute the Research 
Storage Interface (RSI). As mentioned above, it is the 
interface used as the target by the RDS in precompiling 
SEQUEL requests. The user of the RSI needs to know what 
stored files and indexes exist, and must specify the access 
path (index or system sequence) to be used in any given RSI 
access request. 


1. Segments 


In the RSS, all data is stored in a collection of logical 
address spaces called SEGMENTS, which are employed to 
control physical clustering. Segments are used for storing 
user data, access path structures, internal catalog 
information, and intermediate results generated by the RDS. 
All the tuples of any relation must reside within a single 
segment chosen by the RDS, but a given segment may contain 
several relations. Three types of segment are supported, 
each with its own combination of functions and overhead: 
shared (or public), private, and temporary data segments. 
Basically, data in shared segments is recoverable and 
sharable; data in private segments is recoverable but not 
sharable; and data in temporary segments is neither 
recoverable nor sharable. Segment type is fixed at the time 
of system installation and cannot be changed. Each segment 
consists of a sequence of equal-sized pages which are 
referenced and formatted by various components of the RSS. 
The RSS maintains a page map for each segment, which is used 
to map each segment page to its location on disk. At the 
RSI, segments are identified by a numeric segment 
identifier. Pages are identified by page number within the 
segment. Pages are never directly referenced in SEQUEL. 


2. Files and Records 





Each base table is represented as a stored file. A stored 
file is identified at the RSI by a numeric identifier called 
a RID. The RDS is responsible for mapping SEQUEL table-names 
to RIDs. Records in the stored file represent rows of the 
table. Each record is stored as a byte string. The byte 
string consists of a prefix (containing control information, 
such as the RID of the containing file), followed by the 
stored representation of each field in the record. Like 
segments and files, individual tuples have their own numeric 
identifier, called a TID. The TID for a tuple consists of 
two parts: the page number of the page containing the tuple, 
and a byte offset from the bottom of the page identifying a 
slot that contains, in turn, the byte offset of the tuple 
from the top of the page. Operators are available to INSERT 
and DELETE single tuples, and to FETCH and UPDATE any 
combination of fields in a tuple. 
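
The two-part TID can be illustrated with a simplified slotted 
page. This is a sketch under stated assumptions, not the RSS 
layout: the slot table is kept in a Python list instead of 
being stored at the bottom of the page itself, and the class 
and method names are hypothetical. 

```python
class Page:
    """A fixed-size page: tuple data fills from the top of the page,
    while slots (conceptually at the bottom) record where each tuple is."""

    def __init__(self, size=4096):
        self.data = bytearray(size)
        self.free_top = 0    # next free byte for tuple data
        self.slots = []      # slot i holds (offset from top, length)

    def insert(self, record: bytes) -> int:
        """Store a record and return its slot number."""
        if self.free_top + len(record) > len(self.data):
            raise ValueError("page full")
        offset = self.free_top
        self.data[offset:offset + len(record)] = record
        self.free_top += len(record)
        self.slots.append((offset, len(record)))
        return len(self.slots) - 1

    def fetch(self, slot: int) -> bytes:
        """Follow the slot to the tuple's current position."""
        offset, length = self.slots[slot]
        return bytes(self.data[offset:offset + length])

    def move(self, slot: int, new_offset: int) -> None:
        """Relocate a tuple inside the page; only the slot entry changes."""
        offset, length = self.slots[slot]
        record = bytes(self.data[offset:offset + length])
        self.data[new_offset:new_offset + length] = record
        self.slots[slot] = (new_offset, length)


page = Page(size=64)
slot_a = page.insert(b"bolt")
slot_b = page.insert(b"washer")
tid_b = (7, slot_b)          # (page number, slot): page 7 is hypothetical

page.move(slot_b, 40)        # the tuple moves, the TID does not
fetched = page.fetch(tid_b[1])
```

Because the TID names a slot and not a byte position, a tuple 
can be moved within its page without changing its TID, so 
indexes that hold the TID remain valid. 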
3. Images and Links 


An image in the RSS is a logical reordering of an n-ary 
relation with respect to values in one or more sort fields. 
Images combined with scans provide the ability to scan 
relations along a value ordering, for low level support of 
simple views. An image provides associative access 
capability. The RDS can rapidly fetch a tuple from an image 
by keying on the sort field values. A new image can be 
defined at any time on any combination of fields in a 
relation. Each of the fields may be specified as ascending 
or descending. An image can also be dropped at any time. 
The RSS maintains each image through the use of a multipage 
index structure. A new page can be added to an index when 
needed, as long as one of the pages within the segment is 
marked as available. The pages for a given index are 
organized into a balanced hierarchic structure. Each page 
is a node within the hierarchy and contains an ordered 
sequence of index entries. 

A link in the RSS is an access path which is used to 
connect tuples in one or more relations. The RDS determines 
which tuples will be on the link and determines their 
relative position by explicitly using the CONNECT and 
DISCONNECT operations. The RSS maintains internal pointers 
so that newly connected tuples are linked to previous and 
next twins, and previous and next twins are linked to each 
other when a tuple is disconnected. 
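
The pointer maintenance performed on CONNECT and DISCONNECT 
can be sketched as a doubly linked chain of twins. The class 
and function names below are hypothetical, and tuples are 
represented only by an integer standing in for the TID. 

```python
class LinkedTuple:
    """A tuple on a link, with pointers to its previous and next twins."""

    def __init__(self, tid):
        self.tid = tid
        self.prev = None
        self.next = None


def connect(new, after):
    """Place `new` on the link immediately after `after`."""
    new.prev, new.next = after, after.next
    if after.next is not None:
        after.next.prev = new
    after.next = new


def disconnect(node):
    """Remove `node`; its previous and next twins are linked together."""
    if node.prev is not None:
        node.prev.next = node.next
    if node.next is not None:
        node.next.prev = node.prev
    node.prev = node.next = None


def walk(head):
    """Return the TIDs met when following next-twin pointers."""
    tids = []
    while head is not None:
        tids.append(head.tid)
        head = head.next
    return tids


a, b, c = LinkedTuple(1), LinkedTuple(2), LinkedTuple(3)
connect(b, a)
connect(c, b)
chain_before = walk(a)

disconnect(b)
chain_after = walk(a)
```

After the middle tuple is disconnected, its former neighbours 
point directly at each other, so a scan along the link 
simply skips it. 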
4. Transaction Management 


A transaction at the RSS is a sequence of RSI calls on 
behalf of one user. In general, an RSS transaction consists 
of those calls generated by the RDS to execute all RDI 
operators in a single System R transaction, including the 
calls required to perform such RDS internal functions as 
authorization, catalog access, and integrity checking. An 
RSS transaction is marked by the START-TRANS and END-TRANS 
operators. A transaction save point is marked by the 
SAVE-TRANS operator, which returns a save point number for 
subsequent reference. In general, a save point may be 
generated by any of the layers above the RSS. An RDI user 
may mark a save point at a convenient place in the 
transaction in order to handle backout and retry. The RDS 
may mark a save point for each new set oriented SEQUEL 
expression. Transaction recovery occurs when the RDS or 
Monitor issues the RESTORE-TRANS operator, which has a save 
point number as its input parameter, or when the RSS 
initiates the procedure to handle deadlock. The transaction 
recovery function is supported through the maintenance of 
time ordered lists of log entries, which record information 
about each change to recoverable data. Those changes include 
all the tuple and image modifications caused by INSERT, 
DELETE, and UPDATE operations and all the link modifications 
caused by CONNECT and DISCONNECT operations. 


5. Concurrency Control 


Since System R is a concurrent user system, locking 
techniques must be employed to solve various synchronization 
problems, both at the logical level of objects like 
relations and tuples and at the physical level of pages. At 
the logical level, such classic situations as the "lost 
update" problem must be handled to insure that two 
concurrent transactions do not read the same value and then 
both try to write back an incremented value. If these 
transactions are not synchronized, the second update will 
overwrite the first, and the effect of one increment will be 
lost. At the physical level of pages, locking techniques are 
required to insure that internal components of the RSS give 
correct results. 
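
The lost update problem can be reproduced in a few lines. 
The first half of this sketch replays the harmful 
interleaving by hand, so the outcome is deterministic; the 
second half holds a lock across the whole read-modify-write 
step. The variable names and quantities are hypothetical, 
and Python's threading.Lock merely stands in for the RSS 
locking mechanism. 

```python
import threading

# Unsynchronized interleaving, replayed step by step:
# both transactions read the value before either one writes it back.
stock = {"qty": 100}
read_by_t1 = stock["qty"]
read_by_t2 = stock["qty"]
stock["qty"] = read_by_t1 + 10   # T1 writes 110
stock["qty"] = read_by_t2 + 10   # T2 also writes 110; T1's update is lost
lost_result = stock["qty"]

# Synchronized version: the lock covers the read and the write together.
lock = threading.Lock()
safe_stock = {"qty": 100}


def increment(amount):
    with lock:
        value = safe_stock["qty"]
        safe_stock["qty"] = value + amount


threads = [threading.Thread(target=increment, args=(10,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
safe_result = safe_stock["qty"]
```

Two increments of 10 should give 120; the unsynchronized 
replay ends at 110, while the locked version reaches 120. 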
6. Locking 


One basic decision in establishing System R was to handle 
both logical and physical locking requirements within the 
RSS, rather than splitting the functions across the RDS and 
RSS subsystems. Physical locking is handled by setting and 
holding locks on one or more pages during the execution of a 
single RSI operation. Logical locking is handled by setting 
locks on such objects as segments, relations, tuple 
identifiers (TIDs), and key value intervals and holding them 
until they are explicitly released or until the end of the 
transaction. Another basic decision in formulating System R 
was to automate all of the locking functions, both logical 
and physical, so that a user can access shared data and 
delegate some or all lock protocols to the system. 

In order to provide reasonable performance for a wide 
spectrum of user requirements, the RSS supports multiple 
levels of consistency, which control the isolation of a user 
from the actions of other concurrent users [Ref. 2]. When a 
transaction is started at the RSI, one of three consistency 
levels must be specified. Different consistency levels may 
be chosen by different concurrent transactions. For all of 
these levels, the RSS guarantees that any data modified by 
the transaction is not modified by any other until the given 
transaction ends. The differences in consistency levels 
occur during read operations. 

Level-1 consistency offers the least isolation from other 
users, but causes the lowest overhead and lock contention. 
With this level, dirty data may be accessed, and one may 
read different values for the same data item during the same 
transaction. In level-2, the user is assured that every item 
read is clean. However, no guarantee is made that subsequent 
access to the same item will yield the same values or that 
associative access will yield the same item. For the highest 
consistency level (level-3), the user sees the logical 
equivalent of a single user system. Every item read is 
clean, and subsequent reads yield the same values, subject 
to updates by the given user. Level-3 consistency eliminates 
the problem of lost updates and also guarantees that one can 
read a logically consistent version of any collection of 
tuples, since other transactions are logically serialized 
with the given one. 

The RSS components set locks automatically in order to 
guarantee the logical functions of these various consistency 
levels. The RSS employs a single lock mechanism to 
synchronize access to all objects. This synchronization is 
handled by a set of procedures in every activation of the 
RSS, which maintains a collection of queue structures called 
GATES in shared, read-write memory. An internal request to 
lock an object has several parameters: object name, lock 
mode, and an indication of lock duration. Several factors 
affect the choice of lock duration, such as the type of 
action requested by the user and the consistency level of 
the transaction. Data items can be locked at various 
granularities to insure that various applications run 
efficiently. A lock on a single tuple will be effective for 
transactions which access small amounts of data. Locks on 
entire relations or even entire segments will be more 
reasonable for transactions which cause the RDS to access 
large amounts of data. To accommodate these situations, a 
dynamic lock hierarchy protocol has been developed so that a 
small number of locks can be used to lock both few and many 
objects. 


7. Deadlock 





Since locks are requested dynamically, it is possible for 
two or more concurrent activations of the RSS to deadlock. 
The RSS has been designed to check for deadlock situations 
when requests are blocked and to select one or more victims 
for backout if deadlock is detected. The detection is done 
by the Monitor on a periodic basis by looking for cycles in 
a user-user matrix. The selection of a victim is based on 
the relative ages of the transactions in each deadlock cycle 
as well as on the duration of the locks. In general, the RSS 
selects the youngest transaction whose lock is of short 
duration, since the partially completed call can easily be 
undone. If none of the locks in the cycle are of short 
duration, the youngest transaction is chosen. This 
transaction is then backed out to the save point preceding 
the offending lock request, using the transaction recovery 
scheme. 
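
The detection and victim selection steps can be sketched as 
follows. This is an illustration under stated assumptions: 
the user-user matrix is represented as a dictionary mapping 
each transaction to the transactions it waits for, the cycle 
search is a plain depth-first search, and the function names, 
transaction names, and start times are hypothetical. 

```python
def find_cycle(waits_for):
    """Return one cycle in the wait-for graph as a list, or None."""

    def visit(node, path):
        for nxt in waits_for.get(node, ()):
            if nxt in path:
                # We came back to a transaction already on the path.
                return path[path.index(nxt):]
            found = visit(nxt, path + [nxt])
            if found:
                return found
        return None

    for start in waits_for:
        cycle = visit(start, [start])
        if cycle:
            return cycle
    return None


def choose_victim(cycle, start_time, short_duration):
    """Prefer the youngest transaction holding a short-duration lock;
    if there is none, take the youngest transaction in the cycle."""
    candidates = [t for t in cycle if short_duration[t]] or cycle
    return max(candidates, key=lambda t: start_time[t])


# Hypothetical situation: T1 waits for T2, T2 for T3, T3 for T1.
waits_for = {"T1": ["T2"], "T2": ["T3"], "T3": ["T1"], "T4": ["T1"]}
start_time = {"T1": 1, "T2": 5, "T3": 9}     # larger means younger

cycle = find_cycle(waits_for)
victim_short = choose_victim(
    cycle, start_time, {"T1": True, "T2": True, "T3": False}
)
victim_none_short = choose_victim(
    cycle, start_time, {"T1": False, "T2": False, "T3": False}
)
```

T4 is blocked but is not part of the cycle, so it is never 
considered as a victim; only T1, T2, and T3 are candidates. 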




A. INTRODUCTION 


The ORACLE Relational Database Management System is a 
computer program that manages pieces of data stored in a 
computer. ORACLE allows access to this data by providing 
sets of commands that tell the computer what to do. These 
commands are in a language that is called SQL. SQL has 
several facilities for data manipulation. Some of them will 
be used for the Inventory Database. 

All data in ORACLE are stored as tables. Tables are made 
up of columns and rows. The SUPPLIER table shown below has 
four columns (SUPP_NAME, COUNTRY, ADDRESS, and CITY) and 
four rows. A row is made up of fields. Each field contains 
a data value stored where a column and row meet. For 
example, the first row in the SUPPLIER table has the data 
value ITT stored in its SUPP_NAME field, the data value USA 
stored in its COUNTRY field, PO.BOX.9 stored in its ADDRESS 
field, and the data value S.F stored in its CITY field. A 
database can contain many tables. ORACLE allows the 
creation of as many tables as needed. All the tables stored 
in ORACLE make up the database. 

We can create a table using the CREATE TABLE command. 
The command that creates the PART_IDENTIFICATION table is as 
follows: 


UFI> create table part_identification 
  2  ( nsn_no char(14), 
  3  tot_qty_on_hand number(6), 
  4  max_auth_qty_hand number(6), 
  5  sum_of_used_unit number(6) ); 


Table created. 

In the CREATE TABLE command we name the table 
PART_IDENTIFICATION and the columns of the table (NSN_NO, 
TOT_QTY_ON_HAND, MAX_AUTH_QTY_HAND, SUM_OF_USED_UNIT). We 
specify whether each column is to contain only numeric 
values (NUMBER) or character (both numbers and letters) 
values (CHAR). We also specify the maximum length of the 
value that can be stored in the column. For example, no 
NSN_NO can be longer than 14 characters: nsn_no char(14). 

After a table is created, rows can be entered into the 
table using the INSERT command. The following command is 
used to enter the first row into the PART_IDENTIFICATION 
table. 


UFI> insert into part_identification 
  2  values ('1342-241-4111',15000,20000,30000); 


1 record created. 


In the INSERT command we name the table 
PART_IDENTIFICATION into which the row is to be inserted and 
list the data values that go into each column. 
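
For readers without access to ORACLE, the same two commands 
can be tried with Python's built-in sqlite3 module. This is 
a sketch under stated assumptions: SQLite accepts the CHAR 
and NUMBER type names but does not enforce the declared 
lengths, so it only approximates ORACLE's behaviour. The 
table name, column names, and first row are those used in 
this chapter. 

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Same table definition as the UFI session, passed through unchanged.
conn.execute("""
    CREATE TABLE part_identification (
        nsn_no            CHAR(14),
        tot_qty_on_hand   NUMBER(6),
        max_auth_qty_hand NUMBER(6),
        sum_of_used_unit  NUMBER(6)
    )
""")

# Enter the first row; the values are listed in column order.
conn.execute(
    "INSERT INTO part_identification VALUES (?, ?, ?, ?)",
    ("1342-241-4111", 15000, 20000, 30000),
)

rows = conn.execute("SELECT * FROM part_identification").fetchall()
```

Selecting from the table afterwards returns the single row 
that was inserted, in the column order of the definition. 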

In a similar manner, all tables in the inventory database 
are created using the CREATE TABLE command, and all data are 
inserted into the tables using the INSERT command. The final 
versions of the tables are shown below. 

PART_IDENTIFICATION 


UFI> select * 
  2  from part_identification; 


NSN_NO         TOT_QTY_ON_HAND MAX_AUTH_QTY_HAND SUM_OF_USED_UNIT 
1342-241-4111            15000             20000            30000 
2421-311-4111            10000             15000             2000 
2451-312-4115             5000             10000            90000 
5111-111-1511            25000             10000            15000 
2511-511-4511            10000             15000            20000 
1015-512-5112             2000              4000             1500 
7511-632-8332            15000             25000           125000 


7 records selected. 



DOCUMENT_IDENTIFICATION 


UFI> select * 
  2  from document_identification; 


DOCU  SUPP  NSN_NO 
tom1  itt   1342-241-4111 
tom1  itt   2421-311-4155 
tom2  asal  2451-312-4115 
tom1  asal  [illegible] 
tom1  dec   [illegible] 
tom3  ibm   1511-215-5111 
tom2  dec   1015-512-5112 
tom3  asal  7511-632-8332 


8 records selected. 


UNIT_INVENTORY 


UFI> select * 
  2  from unit_inventory; 


UNIT_C  NSN_NO_USE     QTY_ON_HAND  USED_AMOUNT  REQ_AMOUNT 
1base   1342-241-4111         7500        30000        2500 
1base   2421-311-4115         2000        10000        3000 
2base   2412-311-4115         3000        10000        2000 
2base   2451-312-4115         1250         4000         750 
3base   [illegible]           5000        20000        2000 
3base   2511-511-4511         6250         7500        3200 
4base   1511-215-5111         5000        20000        2000 
5base   [illegible]            500         1000         400 
6base   7511-632-8332          400          500         200 
7base   1015-512-5112         7500        25000       10000 
8base   1015-512-5112 
9base   [illegible] 


UNIT_ID 


UFI> select * 
  2  from unit_id; 


UNIT_C  SUPERI  LOCATION 
1base   2taf    [illegible] 
2base   2taf    [illegible] 
3base   3taf    [illegible] 
4base   4taf    [illegible] 
5base   4taf    [illegible] 
6base   1taf    31N25E 
7base   1taf    37N25E 
8base   2taf    [illegible] 
9base   3taf    [illegible] 


9 records selected. 


PART_ORDER 


UFI> select * 
  2  from part_order; 


NSN_NO         SUPP_N  DATES  AMOUNT 
1342-241-4111 
2451-312-4115 
5111-111-1511 
1511-215-5111 
2511-511-4511 
1015-512-5112 
7511-632-8332 


7 records selected. 


DEPOT_STOCK_LEVEL 


UFI> select * 
  2  from depot_stock_level; 


DEPO   STOCK_AMOUNT  SUPP  NSN_NO_REGIST 
depo1          7500  itt   1342-241-4111 
depo2          5000  asal  2421-311-4115 
depo3          2500  ibm   2451-312-4115 
depo4          5000  dec   [illegible] 
depo5          1000  ibm   1015-512-5112 
depo6          7500  dec   7511-632-8332 
depo7         11250  ibm   1511-215-5111 
depo8          1250  asal  5111-111-1511 


8 records selected. 


SUPPLIER 


UFI> select * 
  2  from supplier; 


SUPP  COUNTR  ADDRESS    CITY 
itt   usa     po.box.9   S.F 
ibm   usa     po.box.12  [illegible] 
dec   usa     po.box.11  N.Y 
asal  tur     po.box.7   izmir 


B. SAMPLE QUERIES 


1. List the national stock number of the items whose total 
   quantity on hand equals 10000. 


UFI> select nsn_no 
  2  from part_identification 
  3  where tot_qty_on_hand = 10000; 


NSN_NO 
2421-311-4111 
2511-511-4511 


2. Display the nsn_no of items which are in 1base and for 
   which the required amount is greater than 2500. 


UFI> select nsn_no_use 
  2  from unit_inventory 
  3  where unit_code = '1base' 
  4  and req_amount > 2500; 


NSN_NO_USE 
2421-311-4115 


3. List all bases which are under command of 2taf. 


UFI> select unit_code 
  2  from unit_id 
  3  where superior_comm = '2taf'; 


UNIT_C 
1base 
2base 
8base 


4. List all locations of bases which are under command 
   of 1taf. 


UFI> select unit_code, location 
  2  from unit_id 
  3  where superior_comm = '1taf'; 


UNIT_C  LOCATION 
6base   31N25E 
7base   37N25E 


5. List all suppliers' names and their addresses in the 
   USA. 


UFI> select supp_name, address, city 
  2  from supplier 
  3  where country = 'usa'; 


SUPP  ADDRESS    CITY 
itt   po.box.9   S.F 
ibm   po.box.12  [illegible] 
dec   po.box.11  N.Y 

6. Find total quantity on hand, document, and supplier 
   name for items for which the sum of used amount is 
   greater than 1000. 


UFI> select tot_qty_on_hand, documents, supp_name 
  2  from part_identification, document_identification 
  3  where sum_of_used_unit > 1000 
  4  and part_identification.nsn_no = document_identification.nsn_no; 


TOT_QTY_ON_HAND  DOCU  SUPP 
          15000  tom1  itt 
           5000  tom2  asal 
           2000  tom2  dec 
          15000  tom3  asal 


7. Find total used amount for 1base. 


UFI> select sum(used_amount) 
  2  from unit_inventory 
  3  where unit_code = '1base'; 


SUM(USED_AMOUNT) 
           40000 


8. Find nsn_no and total quantity on hand in descending 
   order by tot_qty_on_hand for items for which the 
   maximum authorized quantity on hand is greater than 
   1500. 


UFI> select nsn_no, tot_qty_on_hand 
  2  from part_identification 
  3  where max_auth_qty_hand > 1500 
  4  order by tot_qty_on_hand desc; 


NSN_NO         TOT_QTY_ON_HAND 
5111-111-1511            25000 
1342-241-4111            15000 
7511-632-8332            15000 
2421-311-4111            10000 
2511-511-4511            10000 
2451-312-4115             5000 
1015-512-5112             2000 


7 records selected. 


9. Display total quantity on hand and sum of used amount 
   for items supplied by ASAL. 


UFI> select tot_qty_on_hand, sum_of_used_unit 
  2  from part_identification 
  3  where nsn_no in 
  4  ( select nsn_no 
  5  from document_identification 
  6  where supp_name = 'asal'); 


TOT_QTY_ON_HAND  SUM_OF_USED_UNIT 
           5000             90000 
          15000            125000 


10. Find order amount, dates, and shipment type for 
    items which are in the 1BASE inventory. 


UFI> select amount, dates, ship_type 
  2  from part_order 
  3  where nsn_no in 
  4  ( select nsn_no_use 
  5  from unit_inventory 
  6  where unit_code = '1base'); 


AMOUNT  DATES   SHIP 
  2500  031685  air 


11. Find required amount and order amount for items in 
    the 2BASE inventory. 


UFI> select req_amount, amount 
  2  from unit_inventory, part_order 
  3  where unit_inventory.unit_code = '2base' 
  4  and unit_inventory.nsn_no_use = part_order.nsn_no; 


REQ_AMOUNT  AMOUNT 
       750    5000 


12. Find total quantity on hand and maximum authorized 
    amount on hand for items for which nsn_no is 
    1015-512-5112. 


UFI> select tot_qty_on_hand, max_auth_qty_hand 
  2  from part_identification 
  3  where nsn_no = '1015-512-5112'; 


TOT_QTY_ON_HAND  MAX_AUTH_QTY_HAND 
           2000               4000 



X. CONCLUSIONS AND RECOMMENDATIONS 





An inventory database system is complex and important. 
In order to effectively command and control the inventory of 
an Air Force, the commander must know the status of his 
resources, which reflects the state of operational readiness 
of the Air Force. It is difficult to obtain accurate 
information from the inventory system by using manual 
systems. A database management system must be used in the 
inventory systems in order to increase end-user 
productivity, decrease staff, and enable work to be done 
more efficiently. 

The complex task of logical database design for a 
relational database management system can be greatly 
simplified by use of the Semantic Data Model. SDM is a 
high-level, semantics-based database description and 
structuring formalism for the database, and it enhances the 
usability of the database system. Using the output of SDM in 
the Inventory Database, the records are rearranged in order 
to fit a relational model. ORACLE DBMS was used for 
implementation. ORACLE DBMS offers a high level of 
functionality and provides the User Friendly Interface 
(UFI). It is easy to use for all potential users. 

Finally, database machines are being developed in 
universities and research laboratories. It is obvious that a 
great deal of effort is being devoted to developing, 
studying, and analyzing database computers. These efforts 
will result in quality hardware and software for all 
potential users of relational database management systems. 